Production of viral resistant plants via introduction of untranslatable plus sense viral RNA

FIELD OF THE INVENTION 

This invention is directed to the production of
plants with a reduced susceptibility to virus infection.

BACKGROUND OF THE INVENTION 

Plant viruses are responsible for major losses
in worldwide crop production. Much effort is directed
towards the development of new plant varieties which
exhibit increased resistance to viral infection. Until
recently such efforts were primarily based on the
traditional plant breeding approach; however, this
approach is often limited by a lack of sources of
resistance within the crop species. The advent of
modern molecular biology techniques has facilitated the
development of new methods of rendering plant varieties
resistant to virus attack that are not limited by a
requirement for preexisting resistance genes within a
species.

Molecular Approaches 

Many of these molecular approaches are based on
the theory of pathogen derived resistance (Sanford and
Johnston, 1985). This theory predicts that a "normal"
host (plant) - pathogen (virus) relationship can be
disrupted if the host organism expresses essential
pathogen derived genes. It has been proposed that host
organisms expressing pathogen gene products in excess
amounts, at an inappropriate developmental stage, or in
a dysfunctional form may disrupt the normal replicative
cycle of the pathogen and result in an attenuated or
aborted infection of the host.

Two approaches typify this pathogen derived
resistance: coat protein mediated resistance and
antisense RNA expression. It has been demonstrated that
transgenic plants expressing a plant virus coat protein
can be resistant to infection by the homologous virus.
This coat protein mediated resistance has been
demonstrated for several virus groups. While the
mechanism of this resistance is not yet fully
understood, it has been suggested that the presence of
the plant synthesized coat protein prevents the removal
of the protein coat (uncoating) of an invading virus
and/or virus movement within the infected plant, leading
to resistance.

Plants which express an RNA molecule which is
complementary to plus sense RNA species encoded by the
virus may show a decreased susceptibility to infection
by that virus. Such a complementary RNA molecule is
termed antisense RNA. It is thought that the plant
encoded antisense RNA binds to the viral RNA and thus
inhibits its function.

Potyviruses

The Potato Virus Y, or potyvirus, family
represents a large number of plant viral pathogens which
collectively can infect most crop species including both
monocotyledonous and dicotyledonous plants. Potyvirus
infection can induce a variety of symptoms, including
leaf mottling and seed and fruit distortion, and can
severely compromise crop yield and/or quality (Hollings
and Brunt, 1981).

Potyviruses have a single-strand plus sense RNA
of circa 10,000 nucleotides which has a viral encoded
protein linked to the 5' end and a 3' polyadenylate
region. A single open reading frame codes for a 351 kDa
polyprotein which is proteolytically processed into
mature viral gene products. The RNA is encapsidated by
approximately 2,000 copies of a coat protein monomer to
form a virion. This capsid protein is encoded by the
sequence present at the 3' end of the large open reading
frame.

Potyviruses can be transmitted by aphids and
other sap feeding insects and in some instances can also
be transmitted in the seeds of infected plants.
Replication of the viral RNA is thought to occur in the
cytoplasm of infected plant cells after uncoating. The
replication mechanism involves both translation of the
plus sense RNA to yield viral gene products (which
include a replicase and a proteinase) and also the
synthesis of a minus sense RNA strand. This minus sense
strand then acts as a template for the synthesis of many
plus sense genomes which are subsequently encapsidated
in coat protein to yield infectious mature "virions,"
thus completing the replicative cycle of the virus.

Experiments have been reported in which
transgenic plants expressing the coat protein gene of a
potyvirus show a reduced susceptibility to virus
infection (Lawson et al. 1990; Ling et al. 1991; Stark
and Beachy 1989).

SUMMARY OF THE INVENTION 

The disclosed invention concerns a method of 
producing plants with a decreased susceptibility to 
virus infection. This is achieved by transforming 
plants with a DNA molecule which includes a gene derived 
in part from the genome of a plant virus. This gene is 
specifically constructed to produce an untranslatable 
version of a plus sense RNA molecule required for viral
replication. Thus, expression of the gene within the 
plant causes the production of this non-functional 
molecule which then inhibits viral replication within 
the plant, rendering the plant resistant to viral 
infection. 

In particular, the invention provides an
alternative and novel approach to rendering plants 
resistant to potyvirus infection. 

Plants are transformed with a gene construct 
engineered to express an untranslatable form of the plus 
sense RNA which encodes the coat protein of a potyvirus. 

In the case of Tobacco Etch Virus (TEV), it is
demonstrated that tobacco plants transformed with such a
gene construct accumulate the untranslatable plus sense
RNA but do not produce detectable levels of the coat
protein. It is further shown that these plants are
resistant to TEV infection. It is also shown that
tobacco cells expressing this untranslatable plus sense
RNA do not support TEV replication, unlike control
tobacco cells and also unlike tobacco cells which are
engineered to express the plus sense translatable RNA
and which, as a result, accumulate TEV coat protein.
Although the exact mechanism is unknown, it is proposed
that the untranslatable plus sense RNA inhibits viral
replication by binding to the minus sense RNA and
preventing the minus sense RNA from functioning in the
replication cycle.

It is believed that this approach will be 
applicable to other potyviruses, to genes other than the 
coat protein gene and to other plus sense RNA virus 
families. It is also believed that this means of
inhibiting gene function is applicable to other 
biological systems, including mammalian viruses. 

DESCRIPTION OF DRAWINGS 

Fig. 1 represents the nucleotide sequence of
the Tobacco Etch Virus genome and its deduced amino acid
sequence, according to Allison et al. (1986). The
nucleotide sequence of the plus sense strand of the DNA
inserts is given. The first nucleotide (N) could not be
determined unequivocally. The predicted amino acid
sequence of the large ORF of reading frame three of the
virion sense RNA is presented in the nucleotide sequence.
This sequence is also set forth in SEQ ID No. 1 of the
enclosed sequence listing. The termination codon at the
end of the large ORF is marked with a *. The putative
cleavage site between the large (54,000 Mr) nuclear
inclusion protein and the capsid protein is indicated by
the arrow. Oligonucleotide primer binding sites are
underlined and labeled.

Fig. 2 is a schematic representation of the
construction of pTC:FL, utilized in construction of
transformation vectors for the invention. Restriction
endonuclease sites were introduced into pTL 37/8595 at
positions A, B and C in the diagram. Following these
nucleotide changes the mutated pTL 37/8595 was digested
with the restriction enzyme NcoI, the DNA fragment
delineated by the restriction enzyme sites at B and C
was removed, and the plasmid religated to generate
pTC:FL. pTC:FL contains the Tobacco Etch Virus (TEV)
coat protein nucleotide sequence flanked by BamHI
restriction sites and the TEV 5' and 3' untranslated
sequences (UTS). T7 and SP6 promoters are also shown.
Abbreviations used in this diagram are as follows: T7,
T7 RNA polymerase promoter sequence; SP6, SP6 RNA
polymerase promoter sequence; ori, origin of
replication; M13 ori, bacteriophage M13 single-stranded
origin of replication; ampR, β-lactamase gene. Lightly
stippled areas are TEV 5' and 3' untranslated sequences;
solid black area, TEV genome cDNA nucleotides 144 to
200; striped area, a portion of the TEV NIb gene (TEV nt
8462-8517); heavily stippled areas, cDNA of TEV CP
nucleotide sequence (TEV nt 8518-9309).

Fig. 3 is a schematic representation of the
forms of the Tobacco Etch Virus coat protein gene
inserted into tobacco in the invention. All constructs
contained the enhanced CaMV 35S (Enh 35S) promoter, CaMV
35S 5' untranslated sequence (UTS) of 50 bp and the CaMV
35S 3' UTS/polyadenylation site of 110 bp. The
nomenclature used to describe the transgenic plant lines
is presented along with the gene products produced in
those plant lines (far right column). Abbreviations are
as follows: 35S, transgenic plants containing the CaMV
35S promoter and 5' and 3' UTS only; FL, transgenic
plants containing the transgene coding for full-length
TEV CP; AS and RC, transgenic plants containing the
transgene expressed as an antisense form of the TEV CP
gene, or an untranslated sense form of the TEV CP gene,
respectively. Stippled areas represent various forms of
the TEV CP nucleotide sequence.

Fig. 4 is a graphic representation of the
appearance of systemic symptoms in plants infected with
Tobacco Etch Virus showing responses of control plants
and transformed plants generated as described in the
invention. Ten B49 (wild type) plants and ten R2 plants
of transgenic plant lines 35S #4, FL #3, FL #24,
homozygous for the inserted TEV gene, were mechanically
inoculated with 50 µl of 1:10 dilution of infected plant
sap (A). Twenty B49 plants and 20 R1 plants of lines AS
#3 and RC #5 were mechanically inoculated with 50 µl of
5 µg/ml TEV (B). Plants were examined daily for the
appearance of systemic symptoms, and any plant
displaying systemic symptoms (attenuated or wild-type)
was recorded as symptomatic.

SEQUENCE LISTING 

The attached sequence listing sets forth 
nucleotide sequences relevant to the present invention.

SEQ ID No. 1 is the complementary DNA sequence 
corresponding to the Tobacco Etch Virus genome.

SEQ ID No. 2 is the nucleotide sequence of the 
modified Tobacco Etch Virus coat protein gene present in 
pTC:FL. 

SEQ ID No. 3 is the nucleotide sequence of the 
modified Tobacco Etch Virus coat protein gene present in
pTC:RC. 

SEQ ID No. 4 is the nucleotide sequence of the 
modified Tobacco Etch Virus coat protein gene present in 
pTC:AS. It is the inverse complement of SEQ ID No. 2. 
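
The relationship stated for SEQ ID No. 4 (the inverse complement of SEQ ID No. 2) can be illustrated with a minimal sketch. The short fragment used here is invented for illustration and is not taken from the sequence listing; the function name is likewise an assumption.

```python
# Illustrative sketch only: the fragment below is hypothetical,
# not an excerpt from SEQ ID No. 2 or SEQ ID No. 4.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def inverse_complement(dna: str) -> str:
    """Return the inverse (reverse) complement of a DNA sequence."""
    # Complement every base, then read the strand in the opposite direction.
    return dna.translate(_COMPLEMENT)[::-1]


sense_fragment = "GGATCCATGGGC"  # hypothetical sense-strand fragment
antisense_fragment = inverse_complement(sense_fragment)
```

Applying the function twice returns the starting sequence, which is a convenient check when comparing a sense listing with its antisense counterpart.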

DETAILED DESCRIPTION

The present invention relates to genetically 
engineered plants which are transformed with a DNA 
molecule encoding an untranslatable plus sense RNA 
molecule.

Definition of Terms

Susceptible plant: A plant that supports viral
replication and displays virus-induced symptoms.

Resistant plant: A plant wherein virus-induced
symptoms are attenuated and virus replication is
attenuated.

Plus sense RNA (and sense RNA): That form of
an RNA which can serve as messenger RNA.

Minus sense RNA: That form of RNA used as a
template for plus sense RNA production.

Antisense RNA: RNA complementary to plus sense
RNA form.

R0 generation: Primary transformants.

R1 generation: Progeny of primary
transformants.

R2 generation: Second generation progeny of R0
generation (i.e., progeny of R1 generation).

A gene derived in part from a plant virus RNA
molecule: At least the portion of the gene encoding the
untranslatable RNA molecule is derived from a plant
virus RNA molecule.

GENERAL DESCRIPTION 

An untranslatable plus sense RNA molecule is
encoded by a gene located on the DNA molecule. The gene
comprises DNA derived from a plant virus RNA genome and
also DNA from heterologous sources. The DNA from
heterologous sources includes elements controlling the
expression of the virus-derived DNA sequences. The DNA
sequence of the gene is specifically altered so as to
render the RNA molecule transcribed from the gene
untranslatable. The presence of this untranslatable
plus sense RNA within the cells of the transformed plant
reduces the susceptibility of the plant to viral
infection.

More particularly, the portion of the gene
which comprises DNA from a plant virus has been derived
from a potyvirus. Plants transformed with the DNA
molecule containing the gene are less susceptible to
infection by potyviruses. Most specifically, the DNA
from the potyvirus source has been derived from the coat
protein gene of Tobacco Etch Virus and transformed
plants are resistant to infection by Tobacco Etch Virus.

Plants which can be made resistant to potyvirus
infection include, but are not limited to, tobacco.

Accordingly, the present invention provides a
method for genetically engineering plants by insertion,
into the plant genome, of a DNA construct containing a
recombinant gene derived from a potyvirus genome such
that the engineered plants display resistance to the
potyvirus.

In accordance with one aspect of the present
invention, genetically transformed plants which are
resistant to infection by a plant potyvirus are produced
by inserting into the genome of the plant a DNA sequence
which causes the production of an untranslatable coat
protein RNA of the potyvirus.

In accordance with another aspect of the
present invention, a DNA sequence is provided to
function in plant cells to cause the production of an
untranslatable plus sense RNA molecule. There has also
been provided, in accordance with yet another aspect of
the present invention, bacterial and transformed plant
cells that contain the above-described DNA. In
accordance with yet another aspect of the present
invention, a differentiated tobacco plant has been
provided that comprises transformed tobacco cells which
express the untranslatable coat protein RNA of Tobacco
Etch Virus and which plants exhibit resistance to
infection by Tobacco Etch Virus.

Other features and advantages of the present
invention will become apparent from the following
description. It should be understood, however, that the
detailed description and the specific examples, while
indicating preferred embodiments of the invention, are
given by way of illustration only, since various changes
and modifications within the spirit and scope of the
invention will become apparent to those skilled in the
art from this detailed description.

A mechanism by which an untranslatable plus
sense RNA molecule, such as described in the current
invention, can function to inhibit the normal biological
function of a minus sense RNA molecule is proposed. One
skilled in the art will recognize that the novel
approach described herein is not limited to the specific
experimental example given and will appreciate the wider
potential utility of the invention.

The expression of a plant gene which exists in
double-stranded DNA form involves transcription of
messenger RNA (mRNA) from one strand of the DNA by RNA
polymerase enzyme, and the subsequent processing of the
mRNA primary transcript inside the nucleus. This
processing involves a 3' nontranslated region which
causes polyadenylate nucleotides to be added to the 3'
end of the viral RNA. Transcription of DNA into mRNA is
regulated by a region of DNA usually referred to as the
"promoter." The promoter region contains a sequence of
bases that signals RNA polymerase to associate with the
DNA and to initiate the transcription of mRNA using one
of the DNA strands as a template to make a corresponding
strand of RNA.

A number of promoters which are active in plant
cells have been described in the literature. Promoters
which are known or are found to cause transcription of
viral RNA in plant cells can be used in the present
invention. Such promoters may be obtained from plants
or viruses and include, but are not limited to, the CaMV
35S promoter. As described below, it is preferred that
the particular promoter selected should be capable of
causing sufficient expression to result in the
production of an effective amount of untranslatable plus
sense RNA to render the plant substantially resistant to
virus infection. The amount of untranslatable plus
sense RNA needed to induce resistance may vary with the
plant type. Accordingly, while the 35S promoter is
preferred, it should be understood that this promoter
may not be the optimal one for all embodiments of the
present invention. Furthermore, the promoters used in
the DNA constructs of the invention may be modified, if
desired, to affect their control characteristics. DNA
sequences have been identified which confer regulatory
specificity on promoter regions. For example, the small
subunit of the ribulose bis-phosphate carboxylase (ss
RUBISCO) gene is expressed in plant leaves but not in
root tissues. A sequence motif that represses the
expression of the ss RUBISCO gene in the absence of
light, to create a promoter which is active in leaves
but not in root tissue, has been identified. This
and/or other regulatory sequence motifs may be ligated
to promoters such as the CaMV 35S promoter to modify the
expression patterns of a gene. Chimeric promoters so
constructed may be used as described herein. For
purposes of this description, the phrase "CaMV 35S
promoter" will therefore include all promoters derived
by means of ligation with operator regions, random or
controlled mutagenesis, as well as tandem or multiple
copies of enhancer elements, and the like.

The 3' nontranslated region of genes which are
known or are found to function as polyadenylation sites
for viral RNA in plant cells can be used in the present
invention. Such 3' nontranslated regions include, but
are not limited to, the 3' transcribed, nontranslated
region of the CaMV 35S gene and the 3' transcribed,
nontranslated regions containing the polyadenylation
signals of the tumor-inducing (Ti) genes of
Agrobacterium, such as the tumor morphology large (tml)
gene. For purposes of this description, the phrase
"CaMV 35S 3' nontranslated region" will therefore
include all such appropriate 3' nontranslated regions.

The DNA constructs of the disclosed embodiment
contain, in double-stranded DNA form, a portion of a
cDNA version of the single-stranded RNA genome of TEV.
In potyviruses, including TEV, the viral genome includes
genes encoding the coat protein, a replicase enzyme and
a proteinase. The disclosed embodiment utilizes the
region of the genome encoding the coat protein gene. In
considering the present invention and the evidence for
the proposed mechanism by which an untranslatable plus
sense RNA molecule can inhibit viral replication, those
skilled in the art will recognize that other portions of
a potyvirus genome could be substituted for the coat
protein gene. Furthermore, it will be apparent that
suitable genomic portions are not limited to complete
gene sequences.

A disclosed embodiment of the invention
utilizes a double-stranded complementary DNA (cDNA)
derived from the region of the TEV genome encoding the
coat protein gene. To the 5' end of this cDNA is
ligated the CaMV 35S promoter and CaMV 35S RNA 5'
nontranslated region. To the 3' end is ligated the CaMV
35S 3' nontranslated region. These 5' and 3' sequences
are present to cause transcription of the gene in plant
cells by the cellular enzyme RNA polymerase to produce
an RNA molecule of sequence corresponding to the
sequence of the coat protein cDNA sequence. Ordinarily,
such an RNA would then be translated by ribosomes which
would synthesize a protein of amino acid sequence
specified by the nucleotide sequence of the RNA
molecule. Particular amino acids are specified by
nucleotide triplets termed codons. Codons which
stipulate translation initiation and termination are
also present in DNA and RNA sequences. The current
invention relates to RNA molecules which are
untranslatable by ribosomes. In the preferred
embodiment the sequence of the TEV cDNA encoding the
coat protein is mutated by a standard in vitro
mutagenesis technique to produce a frameshift mutation
early in the coat protein structural gene immediately
followed by three translation termination signal codons.
These mutations do not affect the ability of RNA
polymerase to transcribe an RNA molecule from the cDNA
but prevent translation of the transcribed RNA by
ribosomes. Those skilled in the art will recognize that
for the disclosed gene and for other genes, DNA
sequences can be altered in other ways to cause the DNA
to encode an untranslatable plus sense RNA molecule.
Thus the disclosed invention is not limited to the
mutations disclosed.
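
The effect of a frameshift followed by termination codons can be sketched with a toy translation routine. This is a minimal illustration under stated assumptions: the 24-nucleotide "gene" and the positions of the changes are invented, the standard genetic code is assumed, and the actual mutations of the preferred embodiment are those given for pTC:RC in the Example and in SEQ ID No. 3.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(coding_dna: str) -> str:
    """Translate from the first ATG up to the first in-frame stop codon."""
    start = coding_dna.find("ATG")
    if start < 0:
        return ""
    residues = []
    for i in range(start, len(coding_dna) - 2, 3):
        amino_acid = CODON_TABLE[coding_dna[i:i + 3]]
        if amino_acid == "*":  # termination codon reached
            break
        residues.append(amino_acid)
    return "".join(residues)


# Hypothetical sequences, not the TEV coat protein gene.
wild_type = "ATGGGCACTGTGGATGCTGGTTGA"
# Step 1: insert a single T after nucleotide 6 (a +1 frameshift).
shifted = wild_type[:6] + "T" + wild_type[6:]
# Step 2: point-mutate the next three codons into stop codons.
untranslatable = shifted[:6] + "TAATGATGA" + shifted[15:]
```

In this sketch the transcript grows by a single nucleotide, so transcription is not expected to be affected, while the ribosome would terminate after the second codon.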

A disclosed embodiment utilizes a cDNA encoding
the coat protein gene of TEV, mutated so as to encode an
untranslatable plus sense RNA. It will be obvious to
one skilled in the art that further sequence alteration
of the cDNA molecule could be used to confer additional
features on the untranslatable plus sense RNA molecule.
Additional features include those which would result in
increased viral resistance of plants transformed with
the cDNA molecule encoding an untranslatable plus sense
RNA. The inclusion of a ribozyme sequence which causes
the RNA catalyzed destruction of the target RNA molecule
would constitute one such additional feature. Suitable
ribozyme sequences are known, as discussed in Tabler and
Tsagris (1991).

A DNA construct in accordance with the present
invention is introduced, via a suitable vector and
transformation method as described below, into plant
cells and plants transformed with the introduced DNA are
regenerated. Various methods exist for transforming
plant cells and thereby generating transgenic plants.
Methods which are known or are found to be suitable for
creating stably transformed plants can be used in this
invention. The choice of method will vary with the type
of plant to be transformed; those skilled in the art
will recognize the suitability of particular methods for
given plant types. Suitable methods may include, but
are not limited to: electroporation of plant
protoplasts; liposome mediated transformation;
polyethylene glycol mediated transformation;
transformation using viruses; microinjection of plant
cells; microprojectile bombardment of plant cells and
Agrobacterium tumefaciens (AT) mediated transformation.
The latter technique is the method of choice for the
disclosed preferred embodiment of the present invention.

In an embodiment of the current invention, the
DNA sequences comprising the CaMV 35S promoter and CaMV
35S nontranslated 3' region and the mutated cDNA
encoding an untranslatable plus sense RNA derived from
the TEV coat protein gene are combined in a single
cloning vector. This vector is subsequently transformed
into AT cells and the resultant cells are used to
transform cultured tobacco cells.

Vectors suitable for the AT mediated
transformation of plants with the DNA of the invention
are disclosed. It will be obvious to one skilled in the
art that a range of suitable vectors is available,
including those disclosed by Bevan (1983),
Herrera-Estrella (1983), Klee (1985) and EPO publication
12,516 (Schilperoort et al.). Suitable vectors are
available on a commercial basis from Clontech (Palo
Alto, CA) and Pharmacia LKB (Pleasant Hill, CA) and
other sources.

Following the transformation of plant cells and
regeneration of transformed plants with the DNA
molecules as described, regenerated plants are tested
for increased virus resistance. Plants are preferably
exposed to the virus at a concentration within a range
where the rate of disease development correlates
linearly with virus concentration. Methods for virus
inoculation are well known to those skilled in the art
and are reviewed by Kado and Agrawal (1972). One such
method includes abrading a leaf surface with an aqueous
suspension containing an abrasive material such as
carborundum and virus, or dusting leaves with such an
abrasive material and subsequently applying the virus
onto the leaf surface. A virus suspension can be
directly inoculated into leaf veins or alternatively
plants can be inoculated using insect vectors. The
virus suspension may comprise purified virus particles,
or alternatively, sap from virus infected plants may be
utilized.

Transformed plants are then assessed for
resistance to the virus. Resistance
or reduced susceptibility may be manifest in different
ways dependent on the particular virus type and plant
type. Those skilled in the art will realize that a
comparison of symptom development on a number of
inoculated untransformed plants with symptom development
on similarly inoculated transformed plants will provide
a preferred method of determining the effects of
transformation with the specified DNA molecule on plant
resistance. Symptoms of infection include, but are not
limited to, leaf mottling, chlorosis and etching. Plants
showing increased viral resistance may be recognized by
delay in appearance of such symptoms or attenuation or
total lack of such symptoms.
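
A comparison of symptom development of the kind described above might be tabulated as the cumulative fraction of symptomatic plants per day. The sketch below is one possible bookkeeping scheme, not the procedure used in the Example; the function name and both data sets are hypothetical and do not represent experimental results.

```python
from typing import List, Optional, Sequence


def cumulative_symptomatic(onset_days: Sequence[Optional[int]],
                           last_day: int) -> List[float]:
    """Fraction of plants symptomatic on each day from 1 to last_day.

    onset_days holds, for each plant, the first day on which systemic
    symptoms were seen, or None if the plant never became symptomatic.
    """
    total = len(onset_days)
    return [
        sum(1 for d in onset_days if d is not None and d <= day) / total
        for day in range(1, last_day + 1)
    ]


# Hypothetical observations, invented purely for illustration.
untransformed = [4, 4, 5, 5, 6]
transformed = [7, None, None, 9, None]

untransformed_curve = cumulative_symptomatic(untransformed, 6)
transformed_curve = cumulative_symptomatic(transformed, 9)
```

A delayed rise or a lower plateau in the curve for transformed plants would correspond to the delay, attenuation or lack of symptoms mentioned above.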

Example 

Work with tobacco plants and the Tobacco Etch 
Virus (TEV) is illustrative of the invention. 

Construction of gene encoding untranslatable plus sense
RNA molecule

The Highly Aphid Transmissible (HAT) isolate of
Tobacco Etch Virus (TEV) was obtained from Dr. Tom
Pirone (University of Kentucky) and maintained in
Nicotiana tabacum (Burley 21). The virus was purified
from Nicotiana tabacum (Burley 21) 20 to 30 days
following inoculation. Viral purification and RNA
isolation procedures have been described (Dougherty and
Hiebert 1980a). Complementary DNA (cDNA) was
synthesized, made double-stranded and inserted into the
bacterial plasmid pBR322 as described by Allison et al.
(1985a, 1985b, 1986), herein incorporated by reference.
cDNA synthesis was accomplished as follows: Purified
viral RNA primed with oligo(dT12-18) served as a template
for single-strand cDNA synthesis by reverse
transcriptase. Following the addition of homopolymeric
tracts of deoxycytidine 5' monophosphate, second-strand
synthesis, primed with oligo(dG12-18), was completed with
DNA polymerase I. SalI and EcoRI linkers were ligated to
the double-stranded cDNA, which was inserted into the
bacterial plasmid pBR322 (Kurtz and Nicodemus 1981). The
resulting cDNA clones were screened by colony
hybridization (Hanahan and Meselson 1980) with
oligo(dT12-18) primed, 32P-labeled single-stranded TEV
cDNA. Plasmid DNA was isolated from colonies which
hybridized with the probe, and the SalI/EcoRI cDNA
inserts were sized by electrophoresis in a 0.8% (w/v)
agarose gel using a horizontal water-cooled gel
apparatus.

The SalI/EcoRI inserts from the recombinant
molecules were isolated from an agarose gel with NA45
membrane (Schleicher & Schuell, Keene, NH) according to
the manufacturer's protocol. The following restriction
enzymes were used either alone or in combination to
digest the isolated cDNA insert: HindIII, XhoI, AluI,
HaeIII, RsaI, Sau3A, and TaqI. Restriction enzyme
digestion products were inserted into the DNA of an
appropriate M13 bacteriophage (Messing 1983) selected
for the presence of corresponding polylinker restriction
sites, and their nucleotide sequences were determined by
dideoxy chain termination.

Plasmid pTL 37/8595 (Carrington and Dougherty
1987; Carrington et al. 1987, herein incorporated by
reference) contains a cDNA copy of the genomic sequence
of HAT TEV corresponding to nucleotides (nt) 1-200 and
nt 8462-9495 (Fig. 2). (Numbering of the TEV genome
nucleotides is according to that presented in Allison et
al. 1986.) The nucleotide sequence and deduced amino
acid sequence of the Tobacco Etch Virus genome and the
numbering system utilized by Allison et al. (1986) and
herein are shown in Fig. 1 and SEQ ID No. 1 in the
attached sequence listing. The first and last codons of
the coat protein (CP) coding region in the TEV genome
are nt 8518-8520 (encoding the amino acid serine) and
9307-9309 (opal stop codon) respectively. pTL 37/8595
was subjected to in vitro site-directed mutagenesis as
described by Taylor et al. (1985a, 1985b), herein
incorporated by reference. In all cases, nucleotide
changes were confirmed by dideoxy-nucleotide sequencing
(Sanger et al. 1977).
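
The coordinates quoted above for the CP coding region can be checked with simple arithmetic. The sketch below assumes inclusive, 1-based numbering, as is usual for genome coordinates; the helper name is an assumption introduced for illustration.

```python
def span_length(first_nt: int, last_nt: int) -> int:
    """Length of an inclusive, 1-based nucleotide interval."""
    return last_nt - first_nt + 1


CP_FIRST_NT = 8518  # first nucleotide of the first CP codon (from the text)
CP_LAST_NT = 9309   # last nucleotide of the opal stop codon (from the text)

cp_length = span_length(CP_FIRST_NT, CP_LAST_NT)
codons, remainder = divmod(cp_length, 3)
# The final codon is the stop codon, so it does not encode a residue.
encoded_residues = codons - 1
# The stop codon begins at nt 9307; it is in frame if this offset divides by 3.
stop_in_frame = (9307 - CP_FIRST_NT) % 3 == 0
```

Under these assumptions the interval spans 792 nucleotides, that is 264 codons including the stop codon, with the stop codon in the same frame as the first codon.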

TEV nt 9312-9317 were first mutated (Fig. 2) to
generate a BamHI restriction site (GGATCC). TEV nt
8516-8521 were then altered to generate an NcoI site
(CCATGG), changing the first codon of the TEV CP coding
region from AGT (Ser) to ATG (Met). A single
oligonucleotide was then used to mutate TEV nt 133-138
to a BamHI restriction site (GGATCC), nt 143-148 to an
NcoI restriction site (CCATGG) and nt 142 to a
deoxyadenylate residue. These mutations generated an
NcoI site centred on the first codon of the TEV ORF and
in a good translational start context as described by
Kozak (1984). Digestion of the resulting plasmid with
the restriction enzyme NcoI, removing TEV nt
143-200/8462-8516, and religation generated plasmid
pTC:FL. pTC:FL contained only the TEV CP gene flanked
by BamHI restriction sites and TEV 5' and 3'
untranslated sequences (see Fig. 2). The nucleotide
sequence of the TEV CP gene in pTC:FL produced by this
mutagenesis scheme is shown in SEQ ID No. 2 in the
attached sequence listing.
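
The NcoI digestion and religation step can be pictured with a toy string model. This is a simplified sketch: the plasmid is treated as a linear string, the sequence is invented, and the function name is hypothetical. It relies only on the fact that NcoI cuts after the first C of CCATGG, so that joining the outer ends of two cut sites regenerates a single site.

```python
NCOI_SITE = "CCATGG"
NCOI_CUT_OFFSET = 1  # NcoI cuts C^CATGG on the strand shown


def excise_between_sites(seq: str, site: str = NCOI_SITE,
                         cut_offset: int = NCOI_CUT_OFFSET) -> str:
    """Remove the fragment between the first and last site, then religate."""
    first = seq.find(site)
    last = seq.rfind(site)
    if first == -1 or first == last:
        raise ValueError("two distinct sites are required")
    # Keep everything up to the first cut and everything after the last cut.
    return seq[:first + cut_offset] + seq[last + cut_offset:]


# Hypothetical construct: two NcoI sites flanking an unwanted segment.
toy_plasmid = "AAAA" + NCOI_SITE + "TTTTTTTT" + NCOI_SITE + "GGGG"
religated = excise_between_sites(toy_plasmid)
```

In the toy example the intervening segment is lost and exactly one NcoI site remains at the junction, which mirrors the way the two introduced sites bring the upstream sequence directly up to the ATG of the CP coding region.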

Plasmid pTC:RC (RNA Control, producing
untranslatable plus sense RNA) was generated by
insertion of a single deoxythymidylate residue after TEV
nt 8529, and point mutations of TEV nt 8522 (G to C),
8534 (C to A), 8542 (G to A), and 8543 (A to G) to
create a frameshift mutation immediately followed by
three stop codons. An NheI restriction site (GCTAGC)
was simultaneously generated, for screening purposes, at
nt 8539-8544. The nucleotide sequence of the TEV CP
gene in pTC:RC produced by this mutagenesis scheme is
shown in SEQ ID No. 3 in the attached sequence listing.

All plasmids described above were linearized
with HindIII, transcribed with T7 RNA polymerase (Melton
et al. 1984), and translated in a rabbit reticulocyte
lysate containing 35S-methionine (Dougherty and Hiebert
1980a). Radiolabeled translation products were analyzed
by electrophoretic separation on a 12.5% acrylamide gel
containing SDS (Laemmli 1970) and detected by
autoradiography. Transcripts of plasmid pTC:RC produced
no detectable protein products, while transcripts from
pTC:FL produced proteins of the expected sizes.

■ 

The various forms of the CP nucleotide sequence
were then inserted as BamHI cassettes into the plant
expression vector pPEV (see below and Fig. 3).

The full length TEV CP open reading frame of
pTC:FL was inserted in the reverse orientation to make
the antisense (AS) construct pTC:AS. The nucleotide
sequence of the TEV CP gene in pTC:AS is shown in SEQ ID
No. 4 in the attached sequence listing.

Transformation Vector Construction

Construction of pPEV. The vector pPEV is part
of a binary vector system for Agrobacterium tumefaciens
mediated plant cell transformation. Plasmid pPEV was
constructed from the plasmids pCGN 2113 (Calgene), pCIB
710 and pCIB 200 (Ciba Geigy Corp.). pCGN 2113 contains
the "enhanced" Cauliflower Mosaic Virus (CaMV) 35S
promoter (CaMV sequences -941 to -90/-363 to +2, relative
to the transcription start site) in a pUC derived
plasmid backbone. pCIB 710 has been described
(Rothstein et al. 1987) and pCIB 200 is a derivative of
the wide host range plasmid pTJS 75 (Schmidhauser and
Helinski 1985) which contains left and right A.
tumefaciens T37 DNA borders, the plant selectable
NOS/NPT II chimeric gene from the plasmid Bin 6 (Bevan
1984) and part of a pUC polylinker. The small
EcoRI-EcoRV DNA fragment of pCIB 710 (Rothstein et al.
1987) was ligated into EcoRI-EcoRV digested pCGN 2113.
This regenerated the enhanced CaMV 35S promoter (Kay et
al. 1987) of pCGN 2113 and introduced the CaMV 35S 5'
and 3' untranslated sequences into pCGN 2113. The CaMV
35S promoter-terminator cassette of the resulting plasmid
was isolated as an EcoRI-XbaI DNA fragment and ligated
into EcoRI-XbaI digested pCIB 200 to generate pPEV. CP
nucleotide sequences from pTC:FL, pTC:RC, and pTC:AS
were cloned as BamHI cassettes into BamHI digested pPEV
and orientation of inserts confirmed by digestion with
appropriate restriction endonucleases.
Transformation and Regeneration of Tobacco

pPEV plasmids containing TEV CP ORFs were
mobilized from E. coli HB101 into A. tumefaciens A136
containing plasmid pCIB 542 (Ciba Geigy), using the
helper plasmid pRK 2013 in E. coli HB101 and the
tri-parental mating system of Ditta et al. (1980).
Plasmid pCIB 542 supplied vir functions necessary for
T-DNA transfer.

Leaf discs of Nicotiana tabacum cv. Burley 49
were transformed and whole plants regenerated according
to Horsch et al. (1985). Transformed tissue was
selected by culturing callus on MS plates (Murashige and
Skoog 1962) containing 1 µg/ml 6-benzylaminopurine
(Sigma Corp.), 0.1 µg/ml α-naphthaleneacetic acid (Sigma
Corp.), 500 µg/ml carbenicillin and 100 µg/ml kanamycin
sulfate (Sigma Corp.). Shoots were rooted on MS plates
containing 500 µg/ml carbenicillin and 100 µg/ml
kanamycin sulfate, and plantlets were transplanted into
soil and transferred directly into the greenhouse
approximately 2-3 weeks after rooting.

R0, R1 and R2 generation plants were screened
by western and/or northern blot analyses. R2 seed (ca.
100 seeds per R2 plant) was screened for the
kanamycin-resistant phenotype (kanR) by surface sterilizing seed in
10% bleach for 5 min., washing twice in sterile water
and germinating on MS plates containing 100 µg/ml
kanamycin sulfate. R2 seed lines which were 100%
kanamycin resistant were screened by western blot
analysis for expression of TEV coat protein. Those
transgenic plant lines generated and their nomenclature
are presented in Fig. 3.
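
The reasoning behind selecting seed lots that are 100% kanamycin resistant can be sketched numerically. The example below assumes, as a simplification not stated in the text, a single dominant T-DNA insertion and self-pollinated parents: a hemizygous parent is then expected to give resistant and sensitive seedlings in a 3:1 ratio, while a homozygous parent gives only resistant seedlings. The seedling counts and function names are hypothetical.

```python
# Chi-square critical value for 1 degree of freedom at the 5% level.
CRITICAL_1DF_5PCT = 3.841

def chi_square_3_to_1(resistant, sensitive):
    """Chi-square statistic for observed counts against a 3:1 ratio."""
    total = resistant + sensitive
    if total == 0:
        raise ValueError("no seedlings scored")
    expected_r, expected_s = 0.75 * total, 0.25 * total
    return ((resistant - expected_r) ** 2 / expected_r
            + (sensitive - expected_s) ** 2 / expected_s)

def classify_seed_lot(resistant, sensitive):
    """Label a seed lot from counts of resistant and sensitive seedlings."""
    if resistant > 0 and sensitive == 0:
        return "all resistant"
    if chi_square_3_to_1(resistant, sensitive) < CRITICAL_1DF_5PCT:
        return "consistent with 3:1"
    return "other"

# Chance that a hemizygous parent yields 100 resistant seedlings out of 100.
p_all_resistant_by_chance = 0.75 ** 100
```

With about 100 seeds per plant, an all-resistant seed lot is very unlikely to come from a segregating parent under these assumptions, which is why such lots can be treated as homozygous.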
Molecular Analyses of Transgenic Plants 

Transgenic tobacco plants were analyzed by
western and northern blot analyses to determine the
nature of protein and RNA products produced,
respectively. Total RNA samples isolated from the
various transgenic lines were analyzed in northern blot
hybridization studies. Total nucleic acids were
isolated from tissue and RNA precipitated with LiCl as
described by Verwoerd et al. (1989). RNAs were
electrophoretically separated on 1.2% agarose gels
containing 6% (v/v) formaldehyde and transferred to
nitrocellulose. Prehybridization and hybridization
conditions were as described in Sambrook et al. (1989).
Strand specific riboprobes were generated from SP6 or T7
DNA dependent RNA polymerase transcription reactions of
pTL 37/8595 linearized with the restriction enzymes
Asp718 (Boehringer Mannheim, Indianapolis, IN) or
HindIII, respectively, using α-labelled 32P-CTP
ribonucleotide and suggested procedures (Promega,
Madison, WI).

An RNA transcript of approximately 1,000 nt was
expected with all transgenic plant lines. Such a TEV CP
transcript was detected in CP expressing plant lines by
using a minus sense riboprobe containing the TEV CP
sequence. A similar transcript was detected in AS
plants by using a plus sense riboprobe containing the
TEV CP sequence. The transcript in the RC line, while
detected with a minus sense riboprobe, may have migrated
as a slightly larger (ca. 1,100-1,200 nt) RNA species,
possibly due to termination at an alternately selected
site and/or a longer poly-A tail on the transcript.
Differing levels of CP transcript accumulation were
observed among different transgenic plant lines.

Transgenic plant lines expressing the coat protein of
TEV were identified by western blot analysis using
polyclonal antisera to TEV CP. Tissue samples of
regenerated plants were ground in 10 volumes of 2X
Laemmli (Tris-glycine) running buffer (Laemmli 1970) and
clarified by centrifugation in a microcentrifuge for 10
min. at 10,000 x g. Protein concentration was estimated
by the dye binding procedure of Bradford (1976) using
BSA as a standard. Protein samples (50 µg total
protein) were separated on a 12.5% polyacrylamide gel
containing SDS and subjected to the immunoblot transfer
procedures described by Towbin et al. (1979). Anti-TEV
coat protein polyclonal primary antibodies, alkaline
phosphatase conjugated secondary antibodies and the
chromogenic substrates NBT (para-nitro blue tetrazolium
chloride) and BCIP (5-bromo-4-chloro-3-indolyl phosphate
para-toluidine salt) were used to detect bound antigen.

Coat protein products produced in FL plants
were stable and accumulated to different levels in
individual transgenic plant lines. It was estimated by
western blot analysis that between 0.001% and 0.01% of
total extracted protein was TEV CP.
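
As a worked illustration of the scale of these estimates, the short calculation below converts the stated percentages into mass per gel lane. It assumes, hypothetically, that the percentages apply to the 50 µg of total protein loaded per lane as described above; the function name is illustrative.

```python
def mass_from_percent(total_ug, percent):
    """Mass in nanograms of a component that is `percent` % of total_ug µg."""
    return total_ug * (percent / 100.0) * 1000.0  # convert µg to ng

# Assumed 50 µg of total protein per lane (taken from the western blot method).
low_ng = mass_from_percent(50, 0.001)   # about 0.5 ng of coat protein
high_ng = mass_from_percent(50, 0.01)   # about 5 ng of coat protein
```

Under that assumption the detected coat protein corresponds to roughly 0.5 to 5 ng per lane.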
Assessment of Resistance to TEV

Eight-week-old (circa 15 cm tall) R1 and R2
plants were inoculated with either purified virus
preparations or infected plant sap. Inoculum was
applied with sterile, premoistened cotton swabs.
Infected plant sap inoculum was prepared by grinding
TEV-infected N. tabacum Burley 21 leaf tissue (2 weeks
postinoculation) in carborundum and 50 mM sodium
phosphate buffer (pH 7.8) at a ratio of 1 gm:0.2 gm:10 mls,
respectively, and filtering the homogenate through
cheesecloth. TEV virions were purified as described by
Dougherty and Hiebert (1980b). One leaf per plant was
dusted lightly with carborundum (320 grit) and
inoculated at two interveinal locations with 50 µl
(total) of inoculum. Inoculated plants were examined
daily and the appearance and severity of systemic
symptoms recorded. Symptoms on any leaf above the
inoculated leaf were considered to be systemic.

Typically, inoculation of Burley 49 plants with
TEV (either purified virus or plant sap) resulted in
severe chlorosis, mosaic and mottle on systemically
infected leaves approximately 6-7 days after
inoculation. Severe etching of the leaf followed within
a few days. It was observed that transgenic plants
containing only the CaMV promoter and untranslated
sequences (i.e., 35S plant line) responded to challenge
inoculation in a manner similar to wild type Burley 49,
developing extensive chlorosis and etching at the same
rate (Fig. 4A). Plant lines which expressed FL TEV CP
showed little or no delay in the appearance of symptoms
when inoculated with infected plant sap. However, FL
transgenic plants did show a slight attenuation of
symptoms and eventually (2-4 weeks after initial
appearance of symptoms), younger leaf tissue emerged
devoid of symptoms and virus as demonstrated by back
inoculation experiments. Typically chlorosis and
etching on older systemic leaves was limited.

Ten independently transformed RC lines and
seven independently transformed AS lines were obtained.
Progeny from three of the RC lines, including line RC #5,
and from one of the AS lines, AS #3, showed an
altered response to viral infection relative to control
plants. All of these lines were verified to be
transformed and were producing expected RNA products. A
possible explanation for the variation in observed
phenotype is the previously noted "position effect"
whereby the expression of genes from identical DNA
sequences integrated at different locations within the
genome shows varying patterns of tissue specificity.

Ten R2 plants of the FL expressing
line were inoculated with infected plant sap, and 20 R1
plants of lines AS #3 and RC #5 were inoculated with
50 µl of a 5 µg/ml solution of purified TEV. Identical
results to those obtained by purified TEV inoculation
were obtained when AS #3 and RC #5 R1 plants were
inoculated with TEV-infected plant sap, as described
above.

Transgenic Burley 49 plant lines AS #3 and RC
#5, expressing only TEV CP related RNA sequences, showed
a delay in the appearance of symptoms and a modification
of symptoms when inoculated with TEV (Fig. 4B). Since
the 20 R1 plants were not screened for expression of CP
RNA prior to inoculation, some of the symptomatic plants
represented non-expressing plants in which the gene of
interest had been lost during Mendelian segregation.
Modified symptoms on AS #3 plants appeared as small
chlorotic lesions often associated with a vein. Most of
the leaves were devoid of symptoms and virus (determined
by back inoculation experiments). Approximately 15% of
RC #5 plants showed symptoms which were identical to
those of infected Burley 49. However, the remaining RC
#5 plants were entirely asymptomatic, and virus was not
detected in back inoculation studies.
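
The attribution of symptomatic RC #5 plants to segregation can be checked against a simple expectation. The sketch below assumes, hypothetically, a single transgene locus in a selfed hemizygous parent, so that one quarter of R1 progeny lack the transgene, and it reads "approximately 15%" of 20 plants as 3 plants. Neither assumption is stated in the text, and the function names are illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """Probability of at most k successes in n independent trials."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

n_plants = 20
expected_nulls = n_plants * 0.25            # 5 plants expected without the transgene
observed_symptomatic = round(0.15 * n_plants)  # 3 plants (assumed reading of 15%)

# Chance of seeing 3 or fewer null segregants among 20 plants if 25% are expected.
p_at_most_observed = binom_cdf(observed_symptomatic, n_plants, 0.25)
```

Under these assumptions, observing three symptomatic plants where five null segregants are expected is not an unusual outcome, so the observed frequency is compatible with loss of the transgene by segregation.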

Plants from TEV resistant AS and RC lines
showed no increased resistance, relative to
untransformed controls, to infection by two other
members of the potyvirus family, namely Tobacco Vein
Mottling Virus and Potato Virus Y.

R2 generation plants derived from TEV-resistant
RC plants showed the expected Mendelian pattern of
inheritance of the TEV-resistant phenotype.

Analysis of TEV Replication in Protoplasts Derived from 
Transgenic Plant Lines 

In an attempt to explain the results obtained
when AS and RC transgenic plants were challenged with
TEV, it was sought to determine if all of the transgenic
plant lines would support virus replication at a level
comparable to Burley 49. Accumulation of viral encoded
proteins was used as an indirect indicator of viral
replication. Protoplasts were derived from leaf tissue
of homozygous CP expressing plants and electroporated
with TEV RNA according to the procedure of Luciano et
al. (1987). Protoplasts were prepared from transgenic
plants and electroporated according to the procedure of
Luciano et al. (1987). Protoplasts (1 X 10^) were
resuspended in 450 µl electroporation buffer (330 mM
mannitol, 1 mM KPO4 pH 7.0, 150 mM KCl) and
electroporated using a BTX Transfector 300 (BTX, San
Diego, CA) (950 microfarads, 130-volt pulse amplitude,
3.5 mm electrode gap) in the presence or absence of 6 µg
of purified TEV RNA. After electroporation, protoplasts
were incubated for 96 hours in incubation medium as
described in Luciano et al. (1987). Protoplasts were
extracted in 2X Laemmli (Tris-glycine) running buffer,
and 5 X 10* extracted protoplasts were then subjected to
western blot analysis as described above. Protoplast
viability was measured by dye exclusion as described in
Luciano et al. (1987). All electroporated protoplast
samples had equivalent viability counts. The results
indicated that protoplasts from all FL plant lines
supported virus replication at levels comparable to wild
type Burley 49 protoplasts. R1 transgenic plants from
lines AS #3 and RC #5 were initially screened by
northern analysis, and leaves from positive expressors
were used in the production of protoplasts. Transfected
protoplasts derived from AS #3 plants supported TEV
replication, albeit at a reduced level. Protoplasts
derived from RC #5 transgenic plant leaf tissue did not
support TEV replication at a detectable level. These
results, and those presented in the whole plant
inoculation series, suggested that AS and RC plants
interfere with TEV replication.

Discussion of Data 

The above example indicates that varying
degrees of protection from TEV infection can be achieved
by overexpression of coat protein and by expression of
an antisense RNA. The current invention, which comprises
the expression of an untranslatable plus sense RNA
molecule, provides protection against TEV infection that
is more effective than either of these two methods.
Plants of line RC #5, transformed with the disclosed DNA
molecule encoding an untranslatable plus sense RNA
derived from the TEV coat protein gene, were
asymptomatic and appeared to be completely protected from
virus infection. The disclosed invention therefore
represents a new and effective way of generating
potyvirus resistant germplasm.

Tobacco protoplasts derived from plants
expressing the antisense RNA supported a reduced level
of TEV replication compared to control cells derived
from untransformed plants. In contrast, tobacco
protoplasts derived from plants of line RC #5,
expressing the untranslatable plus sense RNA, did not
support detectable TEV replication. This suggests that
the untranslatable plus sense RNA was more effective at
blocking TEV replication in the cells of those
transformed plants tested.

It is proposed that the untranslatable plus
sense RNA inhibits viral replication by hybridizing to
the minus sense RNA replicative template of TEV. The
finding that plants expressing untranslatable plus sense
RNA derived from the TEV coat protein gene are not
protected from infection by Potato Virus Y or Tobacco
Vein Mottling Virus is therefore explained by the circa
40-50% amino acid sequence divergence between the coat
proteins of these viruses and TEV (Allison et al. 1986;
Robaglia et al. 1989; Domier et al. 1986).
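
The proposed hybridization mechanism can be illustrated with a toy calculation of strand complementarity. The sketch below uses short hypothetical RNA sequences (not TEV, PVY or TVMV sequences), considers only Watson-Crick pairs in an ungapped antiparallel alignment, and uses illustrative function names. Hybridization depends on nucleotide rather than amino acid sequence, so the divergence figures cited above are only an indirect indicator.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def minus_strand(plus_rna):
    """Minus sense strand (written 5'->3') templated by a plus sense RNA."""
    return plus_rna.upper().translate(COMPLEMENT)[::-1]

def paired_fraction(plus_rna, minus_rna):
    """Fraction of positions forming Watson-Crick pairs when two equal-length
    strands are aligned antiparallel (no gaps, no G-U pairs)."""
    if len(plus_rna) != len(minus_rna):
        raise ValueError("strands must have equal length")
    # Plus sense sequence that would pair perfectly with minus_rna.
    partner = minus_strand(minus_rna)
    matches = sum(a == b for a, b in zip(plus_rna.upper(), partner))
    return matches / len(plus_rna)

# Hypothetical sequences for illustration only.
transgene_plus = "AUGGCACUGAUC"
homologous_minus = minus_strand(transgene_plus)
heterologous_minus = minus_strand("AUGGCUCUAAUA")  # differs at 3 of 12 positions

same_virus = paired_fraction(transgene_plus, homologous_minus)
other_virus = paired_fraction(transgene_plus, heterologous_minus)
```

In this toy case the transgene transcript pairs completely with the minus strand of the homologous virus but only at three quarters of positions with the diverged sequence, which mirrors the reasoning offered for the lack of protection against other potyviruses.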

From the above-described findings, it would be
reasonable and entirely predictable that if plants were
transformed with a gene encoding an untranslatable plus
sense RNA derived from a gene which was highly conserved
between viruses of the potyvirus family, these
plants would be protected from infection by a wide range
of viruses. Regions of the potyvirus genome which are
sufficiently conserved between potyvirus types to be
potentially useful in such an approach may be readily
determined by one skilled in the art. Highly conserved
regions may be determined by reference to published
sequence data (Allison et al. 1986; Robaglia et al.
1989; Domier et al. 1986; Lain et al. 1989; Maiss et al.
1989). The utility of the identified regions could be
readily determined using the methodologies described
above and substituting the defined region for the TEV
coat protein gene.
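
One way such conserved regions might be located from published sequence data is a sliding-window scan over aligned sequences. The following is a minimal sketch and not a method described in the source: it assumes the sequences have already been aligned to equal length, treats a column as conserved only if every sequence carries the same nucleotide, and uses short hypothetical sequences and illustrative function names.

```python
def column_conservation(aligned):
    """Return one boolean per alignment column: True if all sequences agree."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [len(set(column)) == 1 for column in zip(*aligned)]

def conserved_windows(aligned, window, min_fraction):
    """0-based start positions of windows whose conserved fraction is at
    least min_fraction."""
    mask = column_conservation(aligned)
    starts = []
    for start in range(len(mask) - window + 1):
        fraction = sum(mask[start:start + window]) / window
        if fraction >= min_fraction:
            starts.append(start)
    return starts

# Hypothetical pre-aligned sequences (NOT real potyvirus sequences).
alignment = [
    "ATGGCACTGATCTTTGGCAC",
    "ATGGCACTGAACTATGGGAC",
    "ATGGCACTGTTCTCTGGAAC",
]

fully_conserved = conserved_windows(alignment, window=8, min_fraction=1.0)
```

Here the first nine columns are identical in all three toy sequences, so the 8-nt windows starting at positions 0 and 1 are reported. A real analysis would use much longer windows and a proper multiple alignment, and candidate regions would still need to be tested as described above.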

Regions of the potyvirus genome potentially
suitable include, but are not limited to, the genes
encoding the viral replicase and the viral proteinase.
Furthermore, it will be apparent to one skilled in the
art that highly conserved portions of a particular gene
may also serve in this role.

It will also be apparent to one skilled in the
art that the described invention may also be used to
produce plants resistant to viruses outside of the
potyvirus family in instances where these viruses also
produce a minus sense RNA replicative template.
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: William G. Dougherty and John A. Lindbo
(ii) TITLE OF INVENTION: Production of Plants Showing Immunity to Viral Infection via Introduction of Genes Encoding Untranslatable Plus Sense RNA Molecules
(iii) NUMBER OF SEQUENCES: 4
(iv) CORRESPONDENCE ADDRESS:
    (A) ADDRESSEE: Richard J. Polley
    (B) STREET: One World Trade Center, 121 S.W. Salmon Street, Suite 1600
    (C) CITY: Portland
    (D) STATE: Oregon
    (E) COUNTRY: United States of America
    (F) ZIP: 97204
(v) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Diskette, 5.25 inch
    (B) COMPUTER: IBM PC Compatible
    (C) OPERATING SYSTEM: MS DOS
    (D) SOFTWARE: WordPerfect 5.1
(vi) CURRENT APPLICATION DATA:
    (A) APPLICATION NUMBER: 07/838,509
    (B) FILING DATE: February 19, 1992
    (C) CLASSIFICATION: 435
(vi) PRIOR APPLICATION DATA: None
(vii) ATTORNEY/AGENT INFORMATION:
    (A) NAME: Richard J. Polley, Esq.
    (B) REGISTRATION NUMBER: 28,107
    (C) REFERENCE/DOCKET NUMBER: 245-35829/RJP
(viii) TELECOMMUNICATION INFORMATION:
    (A) TELEPHONE: (503) 226-7391
    (B) TELEFAX: (503) 228-9446

(2) INFORMATION FOR SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 9495
    (B) TYPE: Nucleic Acid
    (C) STRANDEDNESS: Single
    (D) TOPOLOGY: Linear
(ii) MOLECULE TYPE:
    (A) DESCRIPTION: cDNA to genomic RNA
(iii) HYPOTHETICAL: No
(iv) ANTI-SENSE: No
(v) FRAGMENT TYPE: N/A
(vi) ORIGINAL SOURCE:
    (A) ORGANISM: Tobacco Etch Virus (TEV)
    (B) STRAIN: Highly Aphid Transmitted (HAT)
(vii) IMMEDIATE SOURCE: TEV propagated in N. tabacum Burley 49
(viii) POSITION IN GENOME: N/A
(ix) FEATURE:
    (A) NAME/KEY: Coat protein gene
    (B) LOCATION: Genomic nucleotides 8518-9306
    (C) IDENTIFICATION METHOD: -
    (D) OTHER INFORMATION: SEQ ID No. 1 is the cDNA corresponding to the Tobacco Etch Virus Genome.
(x) PUBLICATION INFORMATION:
    (A) AUTHORS: Allison et al.
    (B) TITLE: The nucleotide sequence of the coding region of Tobacco Etch Virus Genomic RNA: Evidence for the Synthesis of a Single Polyprotein
    (C) JOURNAL: Virology
    (D) VOLUME: 154
    (E) ISSUE: -
    (F) PAGES: 9-20
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:



NAAATAACAA ATCTCAACAC AACATATACA AAACAAACGA ATCTCAAGCA ATCAAGCATT 60

CTACTTCTAT TGCAGCAATT TAAATCATTT CTTTTAAAGC AAAAGCAATT TTCTGAAAAT 120

TTTCACCATT TACGAACGAT AGCA ATG GCA CTG ATC TTT GGC ACA GTC AAC GCT 174
Met Ala Leu Ile Phe Gly Thr Val Asn Ala
1 5 10

AAC ATC CTG AAG GAA GTG TTC GGT GGA GCT CGT ATG GCT TGC GTT ACC 222
Asn Ile Leu Lys Glu Val Phe Gly Gly Ala Arg Met Ala Cys Val Thr
15 20 25
AGC GCA CAT ATG GCT GGA GCG AAT GGA AGC ATT TTG AAG AAG GCA GAA 270
Ser Ala His Met Ala Gly Ala Asn Gly Ser Ile Leu Lys Lys Ala Glu
30 35 40

GAG ACC TCT CGT GCA ATC ATG CAC AAA CCA GTG ATC TTC GGA GAA GAG 318
Glu Thr Ser Arg Ala Ile Met His Lys Pro Val Ile Phe Gly Glu Asp
45 50 55

TAC ATT ACC GAG GCA GAC TTG CCT TAC ACA CCA CTC CAT TTA GAG GTC 366
Tyr Ile Thr Glu Ala Asp Leu Pro Tyr Thr Pro Leu His Leu Glu Val
60 65 70

GAT GCT GAA ATG GAG CGG ATG TAT TAT CTT GGT CGT CGC GCG CTC ACC 414
Asp Ala Glu Met Glu Arg Met Tyr Tyr Leu Gly Arg Arg Ala Leu Thr
75 80 85 90

CAT GGC AAG AGA CGC AAA GTT TCT GTG AAT AAC AAG AGG AAC AGG AGA 462
His Gly Lys Arg Arg Lys Val Ser Val Asn Asn Lys Arg Asn Arg Arg
95 100 105

AGG AAA GTG GCC AAA ACG TAC GTG GGG CGT GAT TCC ATT GTT GAG AAG 510
Arg Lys Val Ala Lys Thr Tyr Val Gly Arg Asp Ser Ile Val Glu Lys
110 115 120

ATT GTA GTG CCC CAC ACC GAG AGA AAG GTT GAT ACC ACA GCA GCA GTG 558
Ile Val Val Pro His Thr Glu Arg Lys Val Asp Thr Thr Ala Ala Val
125 130 135

GAA GAC ATT TGC AAT GAA GCT ACC ACT CAA CTT GTG CAT AAT AGT ATG 606
Glu Asp Ile Cys Asn Glu Ala Thr Thr Gln Leu Val His Asn Ser Met
140 145 150

CCA AAG CGT AAG AAG CAG AAA AAC TTC TTG CCC GCC ACT TCA CTA AGT 654
Pro Lys Arg Lys Lys Gln Lys Asn Phe Leu Pro Ala Thr Ser Leu Ser
155 160 165 170

AAC GTG TAT GCC CAA ACT TGG AGC ATA GTG CGC AAA CGC CAT ATG CAG 702
Asn Val Tyr Ala Gln Thr Trp Ser Ile Val Arg Lys Arg His Met Gln
175 180 185

GTG GAG ATC ATT AGC AAG AAG AGC GTC CGA GCG AGG GTC AAG AGA TTT 750
Val Glu Ile Ile Ser Lys Lys Ser Val Arg Ala Arg Val Lys Arg Phe
190 195 200

GAG GGC TCG GTG CAA TTG TTC GCA AGT GTG CGT CAC ATG TAT GGC GAG 798
Glu Gly Ser Val Gln Leu Phe Ala Ser Val Arg His Met Tyr Gly Glu
205 210 215

AGG AAA AGG GTG GAC TTA CGT ATT GAC AAC TGG CAG CAA GAG ACA CTT 846
Arg Lys Arg Val Asp Leu Arg Ile Asp Asn Trp Gln Gln Glu Thr Leu
220 225 230

CTA GAC CTT GCT AAA AGA TTT AAG AAT GAG AGA GTG GAT CAA TCG AAG 894
Leu Asp Leu Ala Lys Arg Phe Lys Asn Glu Arg Val Asp Gln Ser Lys
235 240 245 250

CTC ACT TTT GGT TCA AGT GGC CTA GTT TTG AGG CAA GGC TCG TAC GGA 942
Leu Thr Phe Gly Ser Ser Gly Leu Val Leu Arg Gln Gly Ser Tyr Gly
255 260 265

CCT GCG CAT TGG TAT CGA CAT GGT ATG TTC ATT GTA CGC GGT CGG TCG 990
Pro Ala His Trp Tyr Arg His Gly Met Phe Ile Val Arg Gly Arg Ser
270 275 280

GAT GGG ATG TTG GTG GAT GCT CGT GCG AAG GTA ACG TTC GCT GTT TGT 1038
Asp Gly Met Leu Val Asp Ala Arg Ala Lys Val Thr Phe Ala Val Cys
285 290 295
CAC TCA ATG ACA CAT TAT AGC GAC AAA TCA ATC TCT GAG GCA TTC TTC 1086
His Ser Met Thr His Tyr Ser Asp Lys Ser Ile Ser Glu Ala Phe Phe
300 305 310

ATA CCA TAC TCT AAG AAA TTC TTG GAG TTG AGA CCA GAT GGA ATC TCC 1134
Ile Pro Tyr Ser Lys Lys Phe Leu Glu Leu Arg Pro Asp Gly Ile Ser
315 320 325 330

CAT GAG TGT ACA AGA GGA GTA TCA GTT GAG CGG TGC GGT GAG GTG GCT 1182
His Glu Cys Thr Arg Gly Val Ser Val Glu Arg Cys Gly Glu Val Ala
335 340 345

GCA ATC CTG ACA CAA GCA CTT TCA CCG TGT GGT AAG ATC ACA TGC AAA 1230
Ala Ile Leu Thr Gln Ala Leu Ser Pro Cys Gly Lys Ile Thr Cys Lys
350 355 360

CGT TGC ATG GTT GAA ACA CCT GAC ATT GTT GAG GGT GAG TCG GGA GAA 1278
Arg Cys Met Val Glu Thr Pro Asp Ile Val Glu Gly Glu Ser Gly Glu
365 370 375

AGT GTC ACC AAC CAA GGT AAG CTC CTA GCA ATG CTG AAA GAA CAG TAT 1326
Ser Val Thr Asn Gln Gly Lys Leu Leu Ala Met Leu Lys Glu Gln Tyr
380 385 390

CCA GAT TTC CCA ATG GCC GAG AAA CTA CTC ACA AGG TTT TTG CAA CAG 1374
Pro Asp Phe Pro Met Ala Glu Lys Leu Leu Thr Arg Phe Leu Gln Gln
395 400 405 410

AAA TCA CTA GTA AAT ACA AAT TTG ACA GCC TGC GTG AGC GTC AAA CAA 1422
Lys Ser Leu Val Asn Thr Asn Leu Thr Ala Cys Val Ser Val Lys Gln
415 420 425

CTC ATT GGT GAC CGC AAA CAA GCT CCA TTC ACA CAC GTA CTG GCT GTC 1470
Leu Ile Gly Asp Arg Lys Gln Ala Pro Phe Thr His Val Leu Ala Val
430 435 440

AGC GAA ATT CTG TTT AAA GGC AAT AAA CTA ACA GGG GCT GAT CTC GAA 1518
Ser Glu Ile Leu Phe Lys Gly Asn Lys Leu Thr Gly Ala Asp Leu Glu
445 450 455

GAG GCA AGC ACA CAT ATG CTT GAA ATA GCA AGG TTC TTG AAC AAT CGC 1566
Glu Ala Ser Thr His Met Leu Glu Ile Ala Arg Phe Leu Asn Asn Arg
460 465 470

ACT GAA AAT ATG CGC ATT GGC CAC CTT GGT TCT TTC AGA AAT AAA ATC 1614
Thr Glu Asn Met Arg Ile Gly His Leu Gly Ser Phe Arg Asn Lys Ile
475 480 485 490

TCA TCG AAG GCC CAT GTG AAT AAC GCA CTC ATG TGT GAT AAT CAA CTT 1662
Ser Ser Lys Ala His Val Asn Asn Ala Leu Met Cys Asp Asn Gln Leu
495 500 505

GAT CAG AAT GGG AAT TTT ATT TGG GGA CTA AGG GGT GCA CAC GCA AAG 1710
Asp Gln Asn Gly Asn Phe Ile Trp Gly Leu Arg Gly Ala His Ala Lys
510 515 520

AGG TTT CTT AAA GGA TTT TTC ACT GAG ATT GAC CCA AAT GAA GGA TAC 1758
Arg Phe Leu Lys Gly Phe Phe Thr Glu Ile Asp Pro Asn Glu Gly Tyr
525 530 535

GAT AAG TAT GTT ATC AGG AAA CAT ATC AGG GGT AGC AGA AAG CTA GCA 1806
Asp Lys Tyr Val Ile Arg Lys His Ile Arg Gly Ser Arg Lys Leu Ala
540 545 550

ATT GGC AAT TTG ATA ATG TCA ACT GAC TTC CAG ACG CTC AGG CAA CAA 1854
Ile Gly Asn Leu Ile Met Ser Thr Asp Phe Gln Thr Leu Arg Gln Gln
555 560 565 570
ATT CAA GGC GAA ACT ATT GAG CGT AAA GAA ATT GGG AAT CAC TGC ATT 1902
Ile Gln Gly Glu Thr Ile Glu Arg Lys Glu Ile Gly Asn His Cys Ile
575 580 585

TCA ATG CGG AAT GGT AAT TAC GTG TAC CCA TGT TGT TGT GTT ACT CTT 1950
Ser Met Arg Asn Gly Asn Tyr Val Tyr Pro Cys Cys Cys Val Thr Leu
590 595 600

GAA GAT GGT AAG GCT CAA TAT TCG GAT CTA AAG CAC CCA ACG AAG AGA 1998
Glu Asp Gly Lys Ala Gln Tyr Ser Asp Leu Lys His Pro Thr Lys Arg
605 610 615

CAT CTG GTC ATT GGC AAC TCT GGC GAT TCA AAG TAC CTA GAC CTT CCA 2046
His Leu Val Ile Gly Asn Ser Gly Asp Ser Lys Tyr Leu Asp Leu Pro
620 625 630

GTT CTC AAT GAA GAG AAA ATG TAT ATA GCT AAT GAA GGT TAT TGC TAC 2094
Val Leu Asn Glu Glu Lys Met Tyr Ile Ala Asn Glu Gly Tyr Cys Tyr
635 640 645 650

ATG AAC ATT TTC TTT GCT CTA CTA GTG AAT GTC AAG GAA GAG GAT GCA 2142
Met Asn Ile Phe Phe Ala Leu Leu Val Asn Val Lys Glu Glu Asp Ala
655 660 665

AAG GAC TTC ACC AAG TTT ATA AGG GAC ACA ATT GTT CCA AAG CTT GGA 2190
Lys Asp Phe Thr Lys Phe Ile Arg Asp Thr Ile Val Pro Lys Leu Gly
670 675 680

GCG TGG CCA ACA ATG CAA GAT GTT GCA ACT GCA TGC TAC TTA CTT TCC 2238
Ala Trp Pro Thr Met Gln Asp Val Ala Thr Ala Cys Tyr Leu Leu Ser
685 690 695

ATT CTT TAC CCA GAT GTC CTG AGA GCT GAA CTA CCC AGA ATT TTG GTT 2286
Ile Leu Tyr Pro Asp Val Leu Arg Ala Glu Leu Pro Arg Ile Leu Val
700 705 710

GAT CAT GAC AAC AAA ACA ATG CAT GTT TTG GAT TCG TAT GGG TCT AGA 2334
Asp His Asp Asn Lys Thr Met His Val Leu Asp Ser Tyr Gly Ser Arg
715 720 725 730

ACG ACA GGA TAC CAC ATG TTG AAA ATG AAC ACA ACA TCC CAG CTA ATT 2382
Thr Thr Gly Tyr His Met Leu Lys Met Asn Thr Thr Ser Gln Leu Ile
735 740 745

GAA TTC GTT CAT TCA GGT TTG GAA TCC GAA ATG AAA ACT TAC AAT GTT 2430
Glu Phe Val His Ser Gly Leu Glu Ser Glu Met Lys Thr Tyr Asn Val
750 755 760

GGA GGG ATG AAC CGA GAT GTG GTC ACA CAA GGT GCA ATT GAG ATG TTG 2478
Gly Gly Met Asn Arg Asp Val Val Thr Gln Gly Ala Ile Glu Met Leu
765 770 775

ATC AAG TCT ATA TAC AAA CCA CAT CTC ATG AAG CAG TTA CTT GAG GAA 2526
Ile Lys Ser Ile Tyr Lys Pro His Leu Met Lys Gln Leu Leu Glu Glu
780 785 790

GAG CCA TAC ATA ATT GTC CTG GCA ATA GTC TCC CCT TCA ATT TTA ATT 2574
Glu Pro Tyr Ile Ile Val Leu Ala Ile Val Ser Pro Ser Ile Leu Ile
795 800 805 810

GCC ATG TAC AAC TCT GGA ACT TTT GAG CAG GCG TTA CAA ATG TGG TTG 2622
Ala Met Tyr Asn Ser Gly Thr Phe Glu Gln Ala Leu Gln Met Trp Leu
815 820 825

CCA AAT ACA ATG AGG TTA GCT AAC CTC GCT GCC ATC TTG TCA GCC TTA 2670
Pro Asn Thr Met Arg Leu Ala Asn Leu Ala Ala Ile Leu Ser Ala Leu
830 835 840
GCG CAK AAG TTA ACT TTG GCA GAT TTG TTC GTC CAG CAG CGT AAT TTG 2718
Ala Glu Lys Leu Thr Leu Ala Asp Leu Phe Val Gln Gln Arg Asn Leu
845 850 855

ATT AAT GAG TAT GCG CAG GTA ATT TTG GAC AAT CTG ATT GAC GGT GTC 2766
Ile Asn Glu Tyr Ala Gln Val Ile Leu Asp Asn Leu Ile Asp Gly Val
860 865 870

AGG GTT AAT CAT TCG CTA TCC CTA GCA ATG GAA ATT GTT ACT ATT AAG 2814
Arg Val Asn His Ser Leu Ser Leu Ala Met Glu Ile Val Thr Ile Lys
875 880 885 890

CTG GCC ACC CAA GAG ATG GAC ATG GCG TTG AGG GAA GGT GGC TAT GCT 2862
Leu Ala Thr Gln Glu Met Asp Met Ala Leu Arg Glu Gly Gly Tyr Ala
895 900 905

GTG ACC TCT GAA AAG GTG CAT GAA ATG TTG GAA AAA AAC TAT GTA AAG 2910
Val Thr Ser Glu Lys Val His Glu Met Leu Glu Lys Asn Tyr Val Lys
910 915 920

GCT TTG AAG GAT GCA TGG GAC GAA TTA ACT TGG TTG GAA AAA TTC TCC 2958
Ala Leu Lys Asp Ala Trp Asp Glu Leu Thr Trp Leu Glu Lys Phe Ser
925 930 935

GCA ATC AGG CAT TCA AGA AAG CTC TTG AAA TTT GGG CGA AAG CCT TTA 3006
Ala Ile Arg His Ser Arg Lys Leu Leu Lys Phe Gly Arg Lys Pro Leu
940 945 950

ATC ATG AAA AAC ACC GTA GAT TGC GGC GGA CAT ATA GAC TTG TCT GTG 3054
Ile Met Lys Asn Thr Val Asp Cys Gly Gly His Ile Asp Leu Ser Val
955 960 965 970

AAA TCG CTT TTC AAG TTC CAC TTG GAA CTC CTG AAG GGA ACC ATC TCA 3102
Lys Ser Leu Phe Lys Phe His Leu Glu Leu Leu Lys Gly Thr Ile Ser
975 980 985

AGA GCC GTA AAT GGT GGC GCA AGA AAG GTA AGA GTA GCG AAG AAT GCC 3150
Arg Ala Val Asn Gly Gly Ala Arg Lys Val Arg Val Ala Lys Asn Ala
990 995 1000

ATG ACA AAA GGG GTT TTT CTC AAA ATC TAC AGC ATG CTT CCT GAC GTC 3198
Met Thr Lys Gly Val Phe Leu Lys Ile Tyr Ser Met Leu Pro Asp Val
1005 1010 1015

TAC AAG TTT ATC ACA GTC TCG AGT GTC CTT TCC TTG TTG TTG ACA TTC 3246
Tyr Lys Phe Ile Thr Val Ser Ser Val Leu Ser Leu Leu Leu Thr Phe
1020 1025 1030

TTA TTT CAA ATT GAC TGC ATG ATA AGG GCA CAC CGA GAG GCG AAG GTT 3294
Leu Phe Gln Ile Asp Cys Met Ile Arg Ala His Arg Glu Ala Lys Val
1035 1040 1045 1050

GCT GCA CAG TTG CAG AAA GAG AGC GAG TGG GAC AAT ATC ATC AAT AGA 3342
Ala Ala Gln Leu Gln Lys Glu Ser Glu Trp Asp Asn Ile Ile Asn Arg
1055 1060 1065

ACT TTC CAG TAT TCT AAG CTT GAA AAT CCT ATT GGC TAT CGC TCT ACA 3390
Thr Phe Gln Tyr Ser Lys Leu Glu Asn Pro Ile Gly Tyr Arg Ser Thr
1070 1075 1080

GCG GAG GAA AGA CTC CAA TCA GAA CAC CCC GAG GCT TTC GAG TAC TAC 3438
Ala Glu Glu Arg Leu Gln Ser Glu His Pro Glu Ala Phe Glu Tyr Tyr
1085 1090 1095

AAG TTT TGC ATT GGA AAG GAA GAC CTC GTT GAA CAG GCA AAA CAA CCG 3486
Lys Phe Cys Ile Gly Lys Glu Asp Leu Val Glu Gln Ala Lys Gln Pro
1100 1105 1110
GAG ATA GCA TAC TTT GAA AAG ATT ATA GCT TTC ATC ACA CTT GTA TTA 3534
Glu Ile Ala Tyr Phe Glu Lys Ile Ile Ala Phe Ile Thr Leu Val Leu
1115 1120 1125 1130

ATG GCT TTT GAG GCT GAG CGG AGT GAT GGA GTG TTC AAG ATA CTC AAT 3582
Met Ala Phe Asp Ala Glu Arg Ser Asp Gly Val Phe Lys Ile Leu Asn
1135 1140 1145

AAG TTC AAA GGA ATA CTG AGC TCA ACG GAG AGG GAG ATC ATC TAC ACG 3630
Lys Phe Lys Gly Ile Leu Ser Ser Thr Glu Arg Glu Ile Ile Tyr Thr
1150 1155 1160

CAG AGT TTG GAT GAT TAC GTT ACA ACC TTT GAT GAC AAT ATG ACA ATC 3678
Gln Ser Leu Asp Asp Tyr Val Thr Thr Phe Asp Asp Asn Met Thr Ile
1165 1170 1175

AAC CTC GAG TTG AAT ATG GAT GAA CTC CAC AAG ACG AGC CTT CCT GGA 3726
Asn Leu Glu Leu Asn Met Asp Glu Leu His Lys Thr Ser Leu Pro Gly
1180 1185 1190

GTC ACT TTT AAG CAA TGG TGG AAC AAC CAA ATC AGC CGA GGC AAC GTG 3774
Val Thr Phe Lys Gln Trp Trp Asn Asn Gln Ile Ser Arg Gly Asn Val
1195 1200 1205 1210

AAG CCA CAT TAT AGA ACT GAG GGG CAC TTC ATG GAG TTT ACC AGA GAT 3822
Lys Pro His Tyr Arg Thr Glu Gly His Phe Met Glu Phe Thr Arg Asp
1215 1220 1225

ACT GCG GCA TCG GTT GCC AGC GAG ATA TCA CAC TCA CCC GCA AGA GAT 3870
Thr Ala Ala Ser Val Ala Ser Glu Ile Ser His Ser Pro Ala Arg Asp
1230 1235 1240

TTT CTT GTG AGA GGT GCT GTT GGA TCT GGA AAA TCC ACA GGA CTT CCA 3918
Phe Leu Val Arg Gly Ala Val Gly Ser Gly Lys Ser Thr Gly Leu Pro
1245 1250 1255

TAC CAT TTA TCA AAG AGA GGG AGA GTG TTA ATG CTT GAG CCT ACC AGA 3966
Tyr His Leu Ser Lys Arg Gly Arg Val Leu Met Leu Glu Pro Thr Arg
1260 1265 1270

CCA CTC ACA GAT AAC ATG CAC AAG CAA CTG AGA AGT GAA CCA TTT AAC 4014
Pro Leu Thr Asp Asn Met His Lys Gln Leu Arg Ser Glu Pro Phe Asn
1275 1280 1285 1290

TGC TTC CCA ACT TTG AGG ATG AGA GGG AAG TCA ACT TTT GGG TCA TCA 4062
Cys Phe Pro Thr Leu Arg Met Arg Gly Lys Ser Thr Phe Gly Ser Ser
1295 1300 1305

CCG ATC ACA GTC ATG ACT AGT GGA TTC GCT TTA CAC CAC TTT GCA CGA 4110
Pro Ile Thr Val Met Thr Ser Gly Phe Ala Leu His His Phe Ala Arg
1310 1315 1320

AAC ATA GCT GAG GTA AAA ACA TAC GAT TTT GTC ATA ATT GAT GAA TGT 4158
Asn Ile Ala Glu Val Lys Thr Tyr Asp Phe Val Ile Ile Asp Glu Cys
1325 1330 1335

CAT GTG AAT GAT GCT TCT GCT ATA GCG TTT AGG AAT CTA CTG TTT GAA 4206
His Val Asn Asp Ala Ser Ala Ile Ala Phe Arg Asn Leu Leu Phe Glu
1340 1345 1350

CAT GAA TTT GAA GGA AAA GTC CTC AAA GTG TCA GCC ACA CCA CCA GGT 4254
His Glu Phe Glu Gly Lys Val Leu Lys Val Ser Ala Thr Pro Pro Gly
1355 1360 1365 1370

AGA GAA GTT GAA TTT ACA ACT CAG TTT CCC GTG AAA CTC AAG ATA GAA 4302
Arg Glu Val Glu Phe Thr Thr Gln Phe Pro Val Lys Leu Lys Ile Glu
1375 1380 1385
GAG GCT CTT AGC TTT CAG GAA TTT GTA AGT TTA CAA GGG ACA GGT GCC 4350
Glu Ala Leu Ser Phe Gln Glu Phe Val Ser Leu Gln Gly Thr Gly Ala
1390 1395 1400

AAC GCC GAT GTG ATT AGT TGT GGC GAC AAC ATA CTA GTA TAT GTT GCT 4398
Asn Ala Asp Val Ile Ser Cys Gly Asp Asn Ile Leu Val Tyr Val Ala
1405 1410 1415

AGC TAC AAT GAT GTT GAT AGT CTT GGC AAG CTC CTT GTG CAA AAG GGA 4446
Ser Tyr Asn Asp Val Asp Ser Leu Gly Lys Leu Leu Val Gln Lys Gly
1420 1425 1430

TAC AAA GTG TCG AAG ATT GAT GGA AGA ACA ATG AAG AGT GGA GGA ACT 4494
Tyr Lys Val Ser Lys Ile Asp Gly Arg Thr Met Lys Ser Gly Gly Thr
1435 1440 1445 1450

GAA ATA ATC ACT GAA GGT ACT TCA GTG AAA AAG CAT TTC ATA GTC GCA 4542
Glu Ile Ile Thr Glu Gly Thr Ser Val Lys Lys His Phe Ile Val Ala
1455 1460 1465

ACT AAC ATT ATT GAG AAT GGT GTA ACC ATT GAC ATT GAT GTA GTT GTG 4590
Thr Asn Ile Ile Glu Asn Gly Val Thr Ile Asp Ile Asp Val Val Val
1470 1475 1480

GAT TTT GGG ACT AAG GTT GTA CCA GTT TTG GAT GTG GAC AAT AGA GCG 4638
Asp Phe Gly Thr Lys Val Val Pro Val Leu Asp Val Asp Asn Arg Ala
1485 1490 1495

GTG CAG TAC AAC AAA ACT GTG GTG AGT TAT GGG GAG CGC ATC CAA AAA 4686
Val Gln Tyr Asn Lys Thr Val Val Ser Tyr Gly Glu Arg Ile Gln Lys
1500 1505 1510

CTC GGT AGA GTT GGG CGA CAC AAG GAA GGA GTA GCA CTT CGA ATT GGC 4734
Leu Gly Arg Val Gly Arg His Lys Glu Gly Val Ala Leu Arg Ile Gly
1515 1520 1525 1530

CAA ACA AAT AAA ACA CTG GTT GAA ATT CCA GAA ATG GTT GCC ACT GAA 4782
Gln Thr Asn Lys Thr Leu Val Glu Ile Pro Glu Met Val Ala Thr Glu
1535 1540 1545

GCT GCC TTT CTA TGC TTC ATG TAC AAT TTG CCA GTG ACA ACA CAG AGT 4830
Ala Ala Phe Leu Cys Phe Met Tyr Asn Leu Pro Val Thr Thr Gln Ser
1550 1555 1560

GTT TCA ACC ACA CTG CTG GAA AAT GCC ACA TTA TTA CAA GCT AGA ACT 4878
Val Ser Thr Thr Leu Leu Glu Asn Ala Thr Leu Leu Gln Ala Arg Thr
1565 1570 1575

ATG GCA CAG TTT GAG CTA TCA TAT TTT TAC ACA ATT AAT TTT GTG CGA 4926
Met Ala Gln Phe Glu Leu Ser Tyr Phe Tyr Thr Ile Asn Phe Val Arg
1580 1585 1590

TTT GAT GGT AGT ATG CAT CCA GTC ATA CAT GAC AAG CTG AAG CGC TTT 4974
Phe Asp Gly Ser Met His Pro Val Ile His Asp Lys Leu Lys Arg Phe
1595 1600 1605 1610

AAG CTA CAC ACT TGT GAG ACA TTC CTC AAT AAG TTG GCG ATC CCA AAT 5022
Lys Leu His Thr Cys Glu Thr Phe Leu Asn Lys Leu Ala Ile Pro Asn
1615 1620 1625

AAA GGC TTA TCC TCT TGG CTT ACG AGT GGA GAG TAT AAG CGA CTT GGT 5070
Lys Gly Leu Ser Ser Trp Leu Thr Ser Gly Glu Tyr Lys Arg Leu Gly
1630 1635 1640

TAC ATA GCA GAG GAT GCT GGC ATA AGA ATC CCA TTC GTG TGC AAA GAA 5118
Tyr Ile Ala Glu Asp Ala Gly Ile Arg Ile Pro Phe Val Cys Lys Glu
1645 1650 1655
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ATT CCA GAC TCC TTG CAT GAG 6AA ATT TGG CAC ATT GTA GTC GCC CAT 5166 

lie Pro Asp Ser Leu His Glu Glu lie Trp His lie Val Val Ala His 
1660 1665 1670 

5 AAA 6GT GAC TCG GGT ATT GGG AGG CTC ACT AGC GTA CAG GCA GCA AA6 5214 
Lys Gly Asp Ser Gly lie Gly Arg Leu Thr Ser Val Gin Ala Ala Lys 
1675 1680 1685 1690 

GTT GTT TAT ACT CTG CAA ACG GAT GTG CAC TCA ATT GCG AGG ACT CTA 5262 

10 Val Val Tyr Thr Leu Gin Thr Asp Val His Ser lie Ala Arg Thr Leu 

1695 1700 1705 

GCA TGC ATC AAT AGA CGC ATA GCA GAT GAA CAA ATG AAG CAG AGT CAT 5310 

Ala Cys lie Asn Arg Arg lie Ala Asp Glu Gin Met Lys Gin Ser His 
15 1710 1715 1720 

TTT GAA GCC GCA ACT GGG AGA GCA TTT TCC TTC ACA AAT TAC TCA ATA 5358 
Phe Glu Ala Ala Thr Gly Arg Ala Phe Ser Phe Thr Asn Tyr Ser lie 
1725 1730 1735 

20 

CAA AGC ATA TTT GAC ACG CTG AAA GCA AAT TAT GCT ACA AAG CAT ACG 5406 

Gin Ser lie Phe Asp Thr Leu Lys Ala Asn Tyr Ala Thr Lys His Thr 

1740 1745 1750 

25 AAA GAA AAT ATT GCA GTG CTT CAG CAG GCA AAA GAT CAA TTG CTA GAG 5454 
Lys Glu Asn lie Ala Val Leu Gin Gin Ala Lys Asp Gin Leu Leu Glu 
1755 1760 1765 1770 

TTT TCG AAC CTA GCA AAG GAT CAA GAT GTC ACG GGT ATC ATC CAA GAC 5502 

30 Phe Ser Asn Leu Ala Lys Asp Gin Asp Val Thr Gly He He Gin Asp 

1775 1780 1785 

TTC AAT CAC CTG GAA ACT ATC TAT CTC CAA TCA GAT AGC GAA GTG GCT 5550
Phe Asn His Leu Glu Thr Ile Tyr Leu Gln Ser Asp Ser Glu Val Ala
1790 1795 1800

AAG CAT CTG AAG CTT AAA AGT CAC TGG AAT AAA AGC CAA ATC ACT AGG 5598
Lys His Leu Lys Leu Lys Ser His Trp Asn Lys Ser Gln Ile Thr Arg
1805 1810 1815

40 

GAC ATC ATA ATA GCT TTG TCT GTG TTA ATT GGT GGT GGA TGG ATG CTT 5646 

Asp He He He Ala Leu Ser Val Leu He Gly Gly Gly Trp Het Leu 
1820 1825 1830 

45 GCA ACG TAC TTC AAG GAC AAG TTC AAT GAA CCA GTC TAT TTC CAA GGG 5694 

Ala Thr Tyr Phe Lys Asp Lys Phe Asn Glu Pro val Tyr Phe Gin Gly 
1835 1840 1845 1850 

AAG AAG AAT CAG AAG CAC AAG CTT AAG ATG AGA GAG GCG CGT GGG GCT 5742 
50 Lys Lys Asn Gin Lys His Lys Leu Lys Met Arg Glu Ala Arg Gly Ala 

1855 1860 1865 

AGA GGG CAA TAT GAG GTT GCA GCG GAG CCA GAG GCG CTA GAA CAT TAC 5790 

Arg Gly Gin Tyr Glu Val Ala Ala Glu Pro Glu Ala Leu Glu His Tyr 
55 1870 1875 1880 

TTT GGA AGC GCA TAT AAT AAC AAA GGA AAG CGC AAG GGC ACC ACG AGA 5838 
Phe Gly Ser Ala Tyr Asn Asn Lys Gly Lys Arg Lys Gly Thr Thr Arg 
1885 1890 1895 

60 

GGA ATG GGT GCA AAG TCT CGG AAA TTC ATA AAC ATG TAT GGG TTT GAT 5886
Gly Met Gly Ala Lys Ser Arg Lys Phe Ile Asn Met Tyr Gly Phe Asp
1900 1905 1910

65 CCA ACT GAT TTT TCA TAC ATT AGG TTT GTG GAT CCA TTG ACA GGT CAC 5934 
Pro Thr Asp Phe Ser Tyr He Arg Phe Val Asp Pro. Leu Thr Gly His 
1915 1920 1925 1930 
ACT ATT GAT GAG TCC ACA AAC GCA CCT ATT GAT TTA GTG CAG CAT GAG 5982
Thr Ile Asp Glu Ser Thr Asn Ala Pro Ile Asp Leu Val Gln His Glu
1935 1940 1945

TTT GGA AAG GTT AGA ACA CGC ATG TTA ATT GAC GAT GAG ATA GAG CCT 6030
Phe Gly Lys Val Arg Thr Arg Met Leu Ile Asp Asp Glu Ile Glu Pro
1950 1955 1960

CAA AGT CTT AGC ACC CAC ACC ACA ATC CAT GCT TAT TTG GTG AAT AGT 6078
Gln Ser Leu Ser Thr His Thr Thr Ile His Ala Tyr Leu Val Asn Ser
1965 1970 1975

GGC ACG AAG AAA GTT CTT AAG GTT GAT TTA ACA CCA CAC TCG TCG CTA 6126
Gly Thr Lys Lys Val Leu Lys Val Asp Leu Thr Pro His Ser Ser Leu
1980 1985 1990

CGT GCG AGT GAG AAA TCA ACA GCA ATA ATG GGA TTT CCT GAA AGG GAG 6174
Arg Ala Ser Glu Lys Ser Thr Ala Ile Met Gly Phe Pro Glu Arg Glu
1995 2000 2005 2010

AAT GAA TTG CGT CAA ACC GGC ATG GCA GTG CCA GTG GCT TAT GAT CAA 6222
Asn Glu Leu Arg Gln Thr Gly Met Ala Val Pro Val Ala Tyr Asp Gln
2015 2020 2025

TTG CCA CCA AAG AAT GAG GAC TTG ACG TTT GAA GGA GAA AGC TTG TTT 6270
Leu Pro Pro Lys Asn Glu Asp Leu Thr Phe Glu Gly Glu Ser Leu Phe
2030 2035 2040

AAG GGA CCA CGT GAT TAC AAC CCG ATA TCG AGC ACC ATT TGT CAT TTG 6318
Lys Gly Pro Arg Asp Tyr Asn Pro Ile Ser Ser Thr Ile Cys His Leu
2045 2050 2055

ACG AAT GAA TCT GAT GGG CAC ACA ACA TCG TTG TAT GGT ATT GGA TTT 6366
Thr Asn Glu Ser Asp Gly His Thr Thr Ser Leu Tyr Gly Ile Gly Phe
2060 2065 2070

GGT CCC TTC ATC ATT ACA AAC AAG CAC TTG TTT AGA AGA AAT AAT GGA 6414
Gly Pro Phe Ile Ile Thr Asn Lys His Leu Phe Arg Arg Asn Asn Gly
2075 2080 2085 2090

ACA CTG TTG GTC CAA TCA CTA CAT GGT GTA TTC AAG GTC AAG AAC ACC 6462
Thr Leu Leu Val Gln Ser Leu His Gly Val Phe Lys Val Lys Asn Thr
2095 2100 2105

ACG ACT TTG CAA CAA CAC CTC ATT GAT GGG AGG GAC ATG ATA ATT ATT 6510
Thr Thr Leu Gln Gln His Leu Ile Asp Gly Arg Asp Met Ile Ile Ile
2110 2115 2120

CGC ATG CCT AAG GAT TTC CCA CCA TTT CCT CAA AAG CTG AAA TTT AGA 6558
Arg Met Pro Lys Asp Phe Pro Pro Phe Pro Gln Lys Leu Lys Phe Arg
2125 2130 2135

GAG CCA CAA AGG GAA GAG CGC ATA TGT CTT GTG ACA ACC AAC TTC CAA 6606
Glu Pro Gln Arg Glu Glu Arg Ile Cys Leu Val Thr Thr Asn Phe Gln
2140 2145 2150

ACT AAG AGC ATG TCT AGC ATG GTG TCA GAC ACT AGT TGC ACA TTC CCT 6654
Thr Lys Ser Met Ser Ser Met Val Ser Asp Thr Ser Cys Thr Phe Pro
2155 2160 2165 2170

TCA TCT GAT GGC ATA TTC TGG AAG CAT TGG ATT CAA ACC AAG GAT GGG 6702
Ser Ser Asp Gly Ile Phe Trp Lys His Trp Ile Gln Thr Lys Asp Gly
2175 2180 2185

CAG TGT GGC AGT CCA TTA GTA TCA ACT AGA GAT GGG TTC ATT GTT GGT 6750
Gln Cys Gly Ser Pro Leu Val Ser Thr Arg Asp Gly Phe Ile Val Gly
2190 2195 2200
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ATA CAC TCA GCA TCG AAT TTC ACC AAC ACA AAC AAT TAT TTC ACA AGC 6798 
lie His Ser Ala Ser Asn Phe Thr Asn Thr Asn Asn Tyr Phe Thr Ser 
2205 2210 2215 

5 GTG CCG AAA AAC TTC ATG GAA TTG TTG ACA AAT CAG GAG GCG CAG GAG 6846 
Val Pro Lys Asn Phe Met Glu Leu Leu Thr Asn Gin Glu Ala Gin Gin 
2220 . 2225 2230 

TGG GTT A6T GGT T6G CGA TTA AAT OCT GAG TCA GTA TTG TGG GG6 G6C 6894 
10 Trp Val Ser Gly Trp Arg Leu Asn Ala Asp Ser Val Leu Trp Gly Gly 
2235 2240 2245 2250 

CAT AAA GTT TTC ATG AGC AAA CCT GAA GAG CCT TTT CAG CCA GTT AAG 6942 
His Lys Val Phe Met Ser Lys Pro Glu Glu Pro Phe Gin Pro Val Lys 
15 2255 2260 2265 



GAA GCG ACT CAA CTC ATG AAT GAA TTG GTG TAC TCG CAA GGG GAG AAG 6990 

Glu Ala Thr Gin Leu Met Asn Glu Leu Val Tyr Ser Gin Gly Glu Lys 

2270 2275 2280 

20 

AGG AAA TGG GTC GTG GAA GCA CTG TCA GGG AAC TTG AGG CCA GTG 6CT 7038 
Arg Lys Trp Val Val Glu Ala Leu Ser Gly Asn Leu Arg Pro Val Ala 
2285 2290 2295 



25 GAG TGT CCC AGT CAG TTA GTC ACA AAG CAT GTG GTT AAA GGA AAG T6T 7086 

Glu Cys Pro Ser Gin Leu Val Thr Lys His Val Val Lys Gly Lya Cys 
2300 2305 2310 

CCC CTC TTT GAG CTC TAC TTG CAG TTG AAT CCA GAA AAG GAA GCA TAT 7134 
30 Pro Leu Phe Glu Leu Tyr Leu Gin Leu Asn Pro Glu Lys Glu Ala Tyr 
2315 2320 2325 2330 

TTT AAA CCG ATG ATG GGA GCA TAT AAG CCA AGT CGA CTT AAT AGA GAG 7182 

Phe Lys Pro Met Met Gly Ala Tyr Lys Pro Ser Arg Leu Asn Arg Glu 
35 2335 2340 2345 

GCG TTC CTC AAG GAC ATT CTA AAA TAT GCT AGT GAA ATT GAG ATT GGG 7230 

Ala Phe Leu Lys Asp lie Leu Lys Tyr Ala Ser Glu lie Glu lie Gly 

2350 2355 2360 

40 

AAT GTG GAT TGT GAC TTG CTG GAG CTT GCA ATA AGC ATG CTC GTC ACA 7278 
Asn Val Asp Cys Asp Leu Leu Glu Leu Ala lie Ser Met Leu Val Thr 
2365 2370 2375 

45 AAG CTC AAG GCG TTA GGA TTC CCA ACT GTG AAC TAC ATC ACT GAC CCA 7326 
Lys Leu Lys Ala Leu Gly Phe Pro Thr Val Asn Tyr lie Thr Asp Pro 
2380 2385 2390 

GAG GAA ATT TTT AGT GCA TTG AAT ATG AAA GCA GCT ATG GGA GCA CTA 7374 

50 Glu Glu He Phe Ser Ala Leu Asn Met Lys Ala Ala Met Gly Ala Leu 
2395 2400 2405 2410 

TAC AAA 6GC AAG AAG AAA GAA GCT CTC AGC GAG CTC ACA CTA GAT GAG 7422 
Tyr Lys Gly Lys Lys Lys Glu Ala Leu Ser Glu Leu Thr Leu Asp Glu 
55 2415 2420 2425 



CAG GAG GCA ATG CTC AAA GCA AGT TGC CTG CGA CTG TAT ACG GGA AAG 7470
Gln Glu Ala Met Leu Lys Ala Ser Cys Leu Arg Leu Tyr Thr Gly Lys
2430 2435 2440

TTG GGA ATT TGG AAT GGC TCA TTG AAA GCA GAG TTG CGT CCA ATT GAG 7518
Leu Gly Ile Trp Asn Gly Ser Leu Lys Ala Glu Leu Arg Pro Ile Glu
2445 2450 2455

AAG GTT GAA AAC AAC AAA ACG CGA ACT TTC ACA GCA GCA CCA ATA GAC 7566
Lys Val Glu Asn Asn Lys Thr Arg Thr Phe Thr Ala Ala Pro Ile Asp
2460 2465 2470

ACT CTT CTT GCT GGT AAA GTT TGC GTG GAT GAT TTC AAC AAT CAA TTT 7614
Thr Leu Leu Ala Gly Lys Val Cys Val Asp Asp Phe Asn Asn Gln Phe
2475 2480 2485 2490

TAT GAT CTC AAC ATA AAG GCA CCA TGG ACA GTT GGT ATG ACT AAG TTT 7662
Tyr Asp Leu Asn Ile Lys Ala Pro Trp Thr Val Gly Met Thr Lys Phe
2495 2500 2505

TAT CAG GGG TGG AAT GAA TTG ATG GAG GCT TTA CCA AGT GGG TGG GTG 7710
Tyr Gln Gly Trp Asn Glu Leu Met Glu Ala Leu Pro Ser Gly Trp Val
2510 2515 2520

TAT TGT GAC GCT GAT GGT TCG C2^ TTC GAC AGT TCC TTG ACT CCA TTC 7758
Tyr Cys Asp Ala Asp Gly Ser Gln Phe Asp Ser Ser Leu Thr Pro Phe
2525 2530 2535

CTC ATT AAT GCT GTA TTG AAA GTG CGA CTT GCC TTC ATG GAG GAA TGG 7806
Leu Ile Asn Ala Val Leu Lys Val Arg Leu Ala Phe Met Glu Glu Trp
2540 2545 2550

GAT ATT GGT GAG CAA ATG CTG CGA AAT TTG TAC ACT GAG ATA GTG TAT 7854
Asp Ile Gly Glu Gln Met Leu Arg Asn Leu Tyr Thr Glu Ile Val Tyr
2555 2560 2565 2570

ACA CCA ATC CTC ACA CCC GAT GGT ACT ATC ATT AAG AAG CAT AAA GGC 7902
Thr Pro Ile Leu Thr Pro Asp Gly Thr Ile Ile Lys Lys His Lys Gly
2575 2580 2585

AAC AAT AGC GGG CAA CCT TCA ACA GTG GTG GAC AAC ACA CTC ATG GTC 7950
Asn Asn Ser Gly Gln Pro Ser Thr Val Val Asp Asn Thr Leu Met Val
2590 2595 2600

ATT ATT GCA ATG TTA TAC ACA TGT GAG AAG TGT GGA ATC AAC AAG GAA 7998
Ile Ile Ala Met Leu Tyr Thr Cys Glu Lys Cys Gly Ile Asn Lys Glu
2605 2610 2615

GAG ATT GTG TAT TAC GTC AAT GGC GAT GAC CTA TTG ATT GCC ATT CAC 8046
Glu Ile Val Tyr Tyr Val Asn Gly Asp Asp Leu Leu Ile Ala Ile His
2620 2625 2630

CCA GAT AAA GCT GAG AGG TTG AGT AGA TTC AAA GAA TCT TTC GGA GAG 8094
Pro Asp Lys Ala Glu Arg Leu Ser Arg Phe Lys Glu Ser Phe Gly Glu
2635 2640 2645 2650

TTG GGC CTG AAA TAT GAA TTT GAC TGT ACC ACC AGG GAC AAG ACA CAG 8142
Leu Gly Leu Lys Tyr Glu Phe Asp Cys Thr Thr Arg Asp Lys Thr Gln
2655 2660 2665

TTG TGG TTC ATG TCA CAC AGG GCT TTG GAG AGG GAT GGC ATG TAT ATA 8190
Leu Trp Phe Met Ser His Arg Ala Leu Glu Arg Asp Gly Met Tyr Ile
2670 2675 2680

CCA AAG CTA GAA GAA GAA AGG ATT GTT TCT ATT TTG GAA TGG GAC AGA 8238
Pro Lys Leu Glu Glu Glu Arg Ile Val Ser Ile Leu Glu Trp Asp Arg
2685 2690 2695

TCC AAA GAG CCG TCA CAT AGG CTT GAA GCC ATC TGT GCA TCA ATG ATT 8286
Ser Lys Glu Pro Ser His Arg Leu Glu Ala Ile Cys Ala Ser Met Ile
2700 2705 2710

GAA GCA TGG GGT TAT GAC AAG CTG GTT GAA GAA ATC CGC AAT TTC TAT 8334
Glu Ala Trp Gly Tyr Asp Lys Leu Val Glu Glu Ile Arg Asn Phe Tyr
2715 2720 2725 2730

GCA TGG GTT TTG GAA CAA GCG CCG TAT TCA CAG CTT GCA GAA GAA GGA 8382
Ala Trp Val Leu Glu Gln Ala Pro Tyr Ser Gln Leu Ala Glu Glu Gly
2735 2740 2745
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AAG GCG CCA TAT CTG OCT GAG ACT GCG CTT AAG TTT TTG TAC ACA TCT 8430 

Lys Ala Pro Tyr Leu Ala Glu Thr Ala Leu Lys Phe Leu Tyr Thr Ser 

2750 2755 2760 

5 CAG CAC GGA ACA AAC TCT GAG ATA GAA GAG TAT TTA AAA GTG TTG TAT 8478 
Gin His Gly Thr Asn Ser Glu lie Glu Glu Tyr Leu Lys Val Leu Tyr 
2765 2770 2775 

GAT TAC GAT ATT CCA ACG ACT GAG AAT CTT TAT TTT CAG ACT GGC ACT 8526 

10 Asp Tyr Asp lie Pro Thr Thr Glu Asn Leu Tyr Phe Gin Ser Gly Thr 
2780 2785 2790 

GTG GAT GCT GGT GCT GAC 6CT GGT AAG AAG AAA GAT CAA AAG GAT GAT 8574 
Val Asp Ala Gly Ala Asp Ala Gly Lys Lys Lys Asp Gin Lys Asp Asp 
15 2795 2800 2805 2810 

AAA GTC GCT GAG CAG GCT TCA AAG GAT AGG GAT GTT AAT GCT GGA ACT 8622 

Lys Val Ala Glu Gin Ala Ser Lys Asp Arg Asp Val Asn Ala Gly Thr 

2815 2820 2825 

20 

TCA GGA ACA TTC TCA GTT CCA CGA ATA AAT GCT ATG GCG ACA AAA CTT 8670 

Ser Gly Thr Phe Ser Val Pro Arg lie Asn Ala Met Ala Thr Lys Leu 

2830 2835 2840 

25 CAA TAT CCA AGG ATG AGG GGA GAG GTG GTT 6TA AAC TTG AAT CAC CTT 8718 

Gin Tyr Pro Arg Met Arg Gly Glu Val Val Val Asn Leu Asn His Leu 
2845 2850 2855 

TTA GGA TAC AAG CCA CAG CAA ATT GAT TTG TCA AAT GCT CGA GCC ACA 8766 
30 Leu Gly Tyr Lys Pro Gin Gin He Asp Leu Ser Asn Ala Arg Ala Thr 
2860 2865 2870 

CAT GAG CAG TTT GCC GCG TGG CAT CAG GCA GTG ATG ACA GCC TAT GGA 8814 
His Glu Gin Phe Ala Ala Trp His Gin Ala Val Met Thr Ala Tyr Gly 
35 2875 2880 2885 2890 

GTG AAT GAA GAG CAA ATG AAA ATA TTG CTA AAT GGA TTT ATG GTG TGG 8862 

Val Asn Glu Glu Gin Met Lys He Leu Leu Asn Gly Phe Met Val Trp 

2895 2900 2905 

40 

TGC ATA GAA AAT GGG ACT TCC CCA AAT TTG AAC GGA ACT TGG GTT ATG 8910 
Cys lie Glu Asn Gly Thr Ser Pro Asn Leu Asn Gly Thr Trp Val Met 

2910 2915 2920 

45 ATG GAT GGT GAG GAT CAA GTT TCA TAC CCG CTG AAA CCA ATG GTT GAA 8958 
Met Asp Gly Glu Asp Gin Val Ser Tyr Pro Leu Lys Pro Met Val Glu 
2925 2930 2935 

AAC GCG CAG CCA ACA CTG AGG CAA ATT ATG ACA CAC TTC AGT GAC CTG 9006 
SO Asn Ala Gin Pro Thr Leu Arg Gin lie Met Thr His Phe Ser Asp Leu 
2940 2945 2950 

GCT GAA GCG TAT ATT GAG ATG AGG AAT AGG GAG CGA CCA TAC ATG CCT 9054 
Ala Glu Ala Tyr lie Glu Met Arg Asn Arg Glu Arg Pro Tyr Met Pro 
55 2955 2960 2965 2970 

AGG TAT GGT CTA CAG AGA AAC ATT ACA GAC ATG AGT TTG TCA CGC TAT 9102 

Arg Tyr Gly Leu Gin Arg Asn He Thr Asp Met Ser Leu Ser Arg Tyr 

2975 2980 2985 

60 

GCG TTC GAC TTC TAT GAG CTA ACT TCA AAA ACA CCT GTT AGA GCG AGG 9150 

Ala Phe Asp Phe Tyr Glu Leu Thr Ser Lys Thr Pro Val Arg Ala Arg 

2990 2995 3000 

65 GAG GCG CAT ATG CAA ATG AAA GCT GCT GCA GTA CGA AAC. AGT GGA ACT 9198 

Glu Ala His Met Gin Met Lys Ala Ala Ala Val Arg Asn Ser Gly Thr 
3005 3010 3015 
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AGG TTA TTT GGT CTT GAT GGC AAC GTG GGT ACT GCA GAG GAA GAC ACT 9246 

Arg Leu Phe Gly Leu Asp Gly Aan Val Giy Thr Ala GIu Glu Asp Thr 
3020 3025 3030 

5 GAA CGG CAC ACA GCG CAC GAT GTG AAC CGT AAC ATG CAC ACA CTA TTA 9294 
Glu Arg His Thr Ala His Asp Val Asn Arg Asn Met His Thr Leu Leu 
3035 3040 3045 3050 

GGG GTC CGC CAG TGA TAGTTTCTGC GTGTCTTTGC TTTCCGCTTT TAAGCTTATT 9349 

10 Gly Val Arg Gin 

GTAATATATA TGAATAGCTA TTCACAGTGG GACTTGGTCT TGT6TT6AAT AGTATCTTAT 9409 

ATAITTTAAT AIGICTTATT AGTCTCATTA CTTA6GCGAA CGACAAAGTG AGGTCACCTC 9469 

GGTCTAATTC TCCTAT6TAG T606AG ^49 5 



(3) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 792
(B) TYPE: Nucleic Acid
(C) STRANDEDNESS: Double
(D) TOPOLOGY: Circular

(ii) MOLECULE TYPE: cDNA to genomic RNA

(iii) HYPOTHETICAL: No

(iv) ANTI-SENSE: No

(v) FRAGMENT TYPE: N/A

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Tobacco Etch Virus
(B) STRAIN: Highly Aphid Transmitted
(C) INDIVIDUAL ISOLATE: N/A

(vii) IMMEDIATE SOURCE:
(A) LIBRARY: No
(B) CLONE: pTC:FL

(viii) POSITION IN GENOME: N/A

(ix) FEATURE:
(A) NAME/KEY: Mutations (AGT→ATG)
introduced into nucleotides
corresponding to genomic nucleotides
8518-8520 of SEQ ID NO: 1, to create
initiating methionine codon.
(B) LOCATION: Nucleotides 1-3 of SEQ
ID NO: 2
(C) IDENTIFICATION METHOD: -
(D) OTHER INFORMATION: SEQ ID NO: 2 is
the modified Tobacco Etch Virus coat
protein gene present in pTC:FL.
(x) PUBLICATION INFORMATION:

(A) AUTHORS: Allison et al.
(B) TITLE: The Nucleotide Sequence of the
Coding Region of Tobacco Etch Virus
Genomic RNA: Evidence for the
Synthesis of a Single Polyprotein
(C) JOURNAL: Virology
(D) VOLUME: 154
(E) ISSUE: -
(F) PAGES: 9-20

(A) AUTHORS: Lindbo and Dougherty
(B) TITLE: Untranslatable Transcripts of
the Tobacco Etch Virus Coat Protein
Gene Sequence Can Interfere with
Tobacco Etch Virus Replication in
Transgenic Plants and Protoplasts
(C) JOURNAL: Virology
(D) VOLUME: 189
(E) ISSUE: -
(F) PAGES: 725-733

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

ATG GGC ACT 9
Met Gly Thr
1

GTG GAT GCT GGT GCT GAC GCT GGT AAG AAG AAA GAT CAA AAG GAT GAT 57
Val Asp Ala Gly Ala Asp Ala Gly Lys Lys Lys Asp Gln Lys Asp Asp
5 10 15

AAA GTC GCT GAG CAG GCT TCA AAG GAT AGG GAT GTT AAT GCT GGA ACT 105
Lys Val Ala Glu Gln Ala Ser Lys Asp Arg Asp Val Asn Ala Gly Thr
20 25 30 35

TCA GGA ACA TTC TCA GTT CCA CGA ATA AAT GCT ATG GCC ACA AAA CTT 153
Ser Gly Thr Phe Ser Val Pro Arg Ile Asn Ala Met Ala Thr Lys Leu
40 45 50

CAA TAT CCA AGG ATG AGG GGA GAG GTG GTT GTA AAC TTG AAT CAC CTT 201
Gln Tyr Pro Arg Met Arg Gly Glu Val Val Val Asn Leu Asn His Leu
55 60 65

TTA GGA TAC AAG CCA CAG CAA ATT GAT TTG TCA AAT GCT CGA GCC ACA 249
Leu Gly Tyr Lys Pro Gln Gln Ile Asp Leu Ser Asn Ala Arg Ala Thr
70 75 80

CAT GAG CAG TTT GCC GCG TGG CAT CAG GCA GTG ATG ACA GCC TAT GGA 297
His Glu Gln Phe Ala Ala Trp His Gln Ala Val Met Thr Ala Tyr Gly
85 90 95

GTG AAT GAA GAG CAA ATG AAA ATA TTG CTA AAT GGA TTT ATG GTG TGG 345
Val Asn Glu Glu Gln Met Lys Ile Leu Leu Asn Gly Phe Met Val Trp
100 105 110 115

TGC ATA GAA AAT GGG ACT TCC CCA AAT TTG AAC GGA ACT TGG GTT ATG 393
Cys Ile Glu Asn Gly Thr Ser Pro Asn Leu Asn Gly Thr Trp Val Met
120 125 130

ATG GAT GGT GAG GAT CAA GTT TCA TAC CCG CTG AAA CCA ATG GTT GAA 441
Met Asp Gly Glu Asp Gln Val Ser Tyr Pro Leu Lys Pro Met Val Glu
135 140 145

AAC GCG CAG CCA ACA CTG AGG CAA ATT ATG ACA CAC TTC AGT GAC CTG 489
Asn Ala Gln Pro Thr Leu Arg Gln Ile Met Thr His Phe Ser Asp Leu
150 155 160

GCT GAA GCG TAT ATT GAG ATG AGG AAT AGG GAG CGA CCA TAC ATG CCT 537
Ala Glu Ala Tyr Ile Glu Met Arg Asn Arg Glu Arg Pro Tyr Met Pro
165 170 175

AGG TAT GGT CTA CAG AGA AAC ATT ACA GAC ATG AGT TTG TCA CGC TAT 585
Arg Tyr Gly Leu Gln Arg Asn Ile Thr Asp Met Ser Leu Ser Arg Tyr
180 185 190 195

GCG TTC GAC TTC TAT GAG CTA ACT TCA AAA ACA CCT GTT AGA GCG AGG 633
Ala Phe Asp Phe Tyr Glu Leu Thr Ser Lys Thr Pro Val Arg Ala Arg
200 205 210

GAG GCG CAT ATG CAA ATG AAA GCT GCT GCA GTA CGA AAC AGT GGA ACT 681
Glu Ala His Met Gln Met Lys Ala Ala Ala Val Arg Asn Ser Gly Thr
215 220 225

AGG TTA TTT GGT CTT GAT GGC AAC GTG GGT ACT GCA GAG GAA GAC ACT 729
Arg Leu Phe Gly Leu Asp Gly Asn Val Gly Thr Ala Glu Glu Asp Thr
230 235 240

GAA CGG CAC ACA GCG CAC GAT GTG AAC CGT AAC ATG CAC ACA CTA TTA 777
Glu Arg His Thr Ala His Asp Val Asn Arg Asn Met His Thr Leu Leu
245 250 255

GGG GTC CGC CAG TGA 792
Gly Val Arg Gln
260
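
The start-codon substitution recorded in the FEATURE entry for SEQ ID NO: 2 (AGT→ATG at genomic nucleotides 8518-8520) can be illustrated with a short Python sketch. The 18-nucleotide fragment below is an assumed reading of genomic positions 8518-8535 from the listing after OCR correction, and the helper names `translate` and `replace_codon` are hypothetical. This is an illustration only and not part of the disclosed method.

```python
# Standard genetic code, built in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq):
    """Translate a DNA string in frame 0; '*' marks a stop codon."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def replace_codon(seq, codon_index, new_codon):
    """Return seq with the codon at 0-based codon_index replaced."""
    start = 3 * codon_index
    return seq[:start] + new_codon + seq[start + 3:]


# Assumed reading of genomic nucleotides 8518-8535 (Ser-Gly-Thr-Val-Asp-Ala).
wild_type = "AGTGGCACTGTGGATGCT"
# Replace the first codon (AGT, Ser) with an initiating ATG (Met).
mutant = replace_codon(wild_type, 0, "ATG")

print(translate(wild_type))       # SGTVDA
print(mutant, translate(mutant))  # ATGGGCACTGTGGATGCT MGTVDA
```

The mutant fragment matches the opening codons of SEQ ID NO: 2 (ATG GGC ACT GTG GAT GCT).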

(4) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 793
(B) TYPE: Nucleic Acid
(C) STRANDEDNESS: Double
(D) TOPOLOGY: Circular

(ii) MOLECULE TYPE: cDNA to genomic RNA

(iii) HYPOTHETICAL: No

(iv) ANTI-SENSE: No

(v) FRAGMENT TYPE: N/A

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Tobacco Etch Virus
(B) STRAIN: Highly Aphid Transmitted
(C) INDIVIDUAL ISOLATE: N/A

(vii) IMMEDIATE SOURCE:
(A) LIBRARY: No
(B) CLONE: pTC:RC

(viii) POSITION IN GENOME: N/A

(ix) FEATURE:
(A) NAME/KEY: Mutation of AGT-GGC
(Ser-Gly) to ATG-GCC (Met-Ser)
(B) LOCATION: Nucleotides 1-6 of SEQ
ID NO: 3 (corresponding to nucleotides
8518-8523 of SEQ ID NO: 1)

(A) NAME/KEY: Frameshift mutation
(insertion of T) producing stop codon
(B) LOCATION: Nucleotide 13 of SEQ ID
NO: 3 (corresponding to position
between nucleotides 8529 and 8530 of
SEQ ID NO: 1)
(D) OTHER INFORMATION: SEQ ID NO: 3 is
the modified Tobacco Etch Virus coat
protein gene present in pTC:RC.

(x) PUBLICATION INFORMATION:

(A) AUTHORS: J. A. Lindbo and
W. G. Dougherty
(B) TITLE: Pathogen-Derived Resistance to
a Potyvirus: Immune and Resistant
Phenotypes in Transgenic Tobacco
Expressing Altered Forms of a
Potyvirus Coat Protein Nucleotide
Sequence
(C) JOURNAL: Molecular Plant-Microbe
Interactions
(D) VOLUME: 5
(E) ISSUE: 2
(F) PAGES: 144-153

(A) AUTHORS: J. A. Lindbo and
W. G. Dougherty
(B) TITLE: Untranslatable Transcripts of
the Tobacco Etch Virus Coat Protein
Gene Sequence Can Interfere with
Tobacco Etch Virus Replication in
Transgenic Plants and Protoplasts
(C) JOURNAL: Virology
(D) VOLUME: 189
(E) ISSUE: -
(F) PAGES: 725-733

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

ATG GCC ACT 9
Met Ser Thr

GTG TGA TGA TGGTGCTAGC GCTGGTAAGA AGAAAGATCA AAAGGATGAT 58
Val

AAAGTCGCTG AGCAGGCTTC AAAGGATAGG GATGTTAATG CTGGAACTTC 108

AGGAACATTC TCAGTTCCAC GAATAAATGC TATGGCCACA AAACTTCAAT 158

ATCCAAGGAT GAGGGGAGAG GTGGTTGTAA ACTTGAATCA CCTTTTAGGA 208

TACAAGCCAC AGCAAATTGA TTTGTCAAAT GCTCGAGCCA CACATGAGCA 258

GTTTGCCGCG TGGCATCAGG CAGTGATGAC AGCCTATGGA GTGAATGAAG 308

AGCAAATGAA AATATTGCTA AATGGATTTA TGGTGTGGTG CATAGAAAAT 358

GGGACTTCCC CAAATTTGAA CGGAACTTGG GTTATGATGG ATGGTGAGGA 408

TCAAGTTTCA TACCCGCTGA AACCAATGGT TGAAAACGCG CAGCCAACAC 458

TGAGGCAAAT TATGACACAC TTCAGTGACC TGGCTGAAGC GTATATTGAG 508

ATGAGGAATA GGGAGCGACC ATACATGCCT AGGTATGGTC TACAGAGAAA 558

CATTACAGAC ATGAGTTTGT CACGCTATGC GTTCGACTTC TATGAGCTAA 608

CTTCAAAAAC ACCTGTTAGA GCGAGGGAGG CGCATATGCA AATGAAAGCT 658

GCTGCAGTAC GAAACAGTGG AACTAGGTTA TTTGGTCTTG ATGGCAACGT 708

GGGTACTGCA GAGGAAGACA CTGAACGGCA CACAGCGCAC GATGTGAACC 758

GTAACATGCA CACACTATTA GGGGTCCGCC AGTGA 793
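
The effect of the single-nucleotide insertion described in the FEATURE entry (a T that becomes nucleotide 13) can be sketched in Python. The fragment used here is the first 36 nucleotides of SEQ ID NO: 2 as read from the listing, chosen only for illustration. It is not a reproduction of SEQ ID NO: 3, which carries further changes. The function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def insert_base(seq, position, base):
    """Insert `base` so that it becomes nucleotide `position` (1-based)."""
    return seq[:position - 1] + base + seq[position - 1:]


def first_stop_codon(seq, frame=0):
    """Return the 1-based index of the first in-frame stop codon, or None."""
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return (i - frame) // 3 + 1
    return None


# Opening codons of the translatable gene: ATG GGC ACT GTG GAT GCT GGT ...
orf = "ATGGGCACTGTGGATGCTGGTGCTGACGCTGGTAAG"
# Frameshift: a T inserted as nucleotide 13 turns codon 5 into TGA.
shifted = insert_base(orf, 13, "T")

print(first_stop_codon(orf))      # None: the fragment reads through
print(first_stop_codon(shifted))  # 5
```

A premature stop this close to the initiating codon is what makes the plus sense transcript effectively untranslatable while leaving almost all of the viral sequence intact.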




(5) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 792
(B) TYPE: Nucleic Acid
(C) STRANDEDNESS: Double
(D) TOPOLOGY: Circular

(ii) MOLECULE TYPE: cDNA to genomic RNA

(iii) HYPOTHETICAL: No

(iv) ANTI-SENSE: Yes

(v) FRAGMENT TYPE: N/A

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Tobacco Etch Virus
(B) STRAIN: Highly Aphid Transmitted
(C) INDIVIDUAL ISOLATE: N/A

(vii) IMMEDIATE SOURCE:
(A) LIBRARY: No
(B) CLONE: pTC:AS

(viii) POSITION IN GENOME: N/A

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: -
(C) IDENTIFICATION METHOD: -
(D) OTHER INFORMATION: SEQ ID NO: 4 is the
modified Tobacco Etch Virus coat protein
gene present in pTC:AS. It is the
inverse complement of SEQ ID NO: 2.

(x) PUBLICATION INFORMATION:

(A) AUTHORS: J. A. Lindbo and
W. G. Dougherty
(B) TITLE: Untranslatable Transcripts of
the Tobacco Etch Virus Coat Protein Gene
Sequence Can Interfere with Tobacco Etch
Virus Replication in Transgenic Plants
and Protoplasts
(C) JOURNAL: Virology
(D) VOLUME: 189
(E) ISSUE: -
(F) PAGES: 725-733

(A) AUTHORS: J. A. Lindbo and
W. G. Dougherty
(B) TITLE: Pathogen-Derived Resistance to a
Potyvirus: Immune and Resistant
Phenotypes in Transgenic Tobacco
Expressing Altered Forms of a Potyvirus
Coat Protein Nucleotide Sequence
(C) JOURNAL: Molecular Plant-Microbe
Interactions
(D) VOLUME: 5
(E) ISSUE: 2
(F) PAGES: 144-153

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

TCACtGCCGG ACCCCTA&TA CT5TCTCCAT GTTACGCTTC ACATCCTCCG CTGTCTGCCC 60 

TTCAGTCTCT TCCTCTGCAC TACCCACCTT GCCATCAAGA CCAAATAACC TAGTTCCACT 120 

6TTTCCTACT CCAGCACCTT TCATTTGCAT ATGCCCCTCC CTCCCTCTAA CAGGTGTTTr 180 

TGAAGTTAGC 7CATAGAAGT CCAACCCA7A GCGTGACAAA C7CATGTC7C TAA7GT77C7 240 

CTGTAGACCA TACCTAGCCA TGXATCGTCC CTCCCTATTC CTCAXCTCAA TATACGCTTC 300 

AGCCAGGTCA CXCAAG7CTG 7CAXAATTTG CC7CAGTC77 GCC7GCGCG7 777CAACCA7 360 

7GG777CAGC GGC7A7CAAA C77GA7CC7C ACCATCCA7C A7AACCCAAG 77CCG77CAA 420 

A77TGGGCAA G7CCCA7777 C7A7GCACCA CACCATAAA7 CCAT77ACCA A7AT777CA7 480 

77GC7C77CA 77CAC7CCA7 AGGC7S7CA7 CACXCCC7CA 7GCCACCCGG CAAAC7CC7C 540 

A7G7G7GGC7 CGAGCA7r7G ACAAA7CAA7 77CC7G7GGC 77G7A7CC7A AAACG7GA7r 600 

CAACrrXACA ACCACCrC7C CCC7CA7C:C7 7GCA7A77GA AG7777G7GG CCA7AGCA77 660 

7A7TCC7CGA AC7CAGAATG T7CS7GAAG7 7CCACCA77A ACA7CCC:7A7 CC777GAAGC 720 
C7GCrCACCG AC777A7CA7 crrr77aA7C 7TTCTTC77A CCAGCC7CAG CACCAGCA7C 
CACA67GCCC: AT 



780 
792 
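
SEQ ID NO: 4 is described above as the inverse complement of SEQ ID NO: 2. A minimal Python sketch of that operation follows. The function name is hypothetical, and the two test fragments are the final 15 and first 12 nucleotides of SEQ ID NO: 2 as read from the listing.

```python
# Map each base to its Watson-Crick partner (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the inverse (reverse) complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


# The 3' end of SEQ ID NO: 2 (GGG GTC CGC CAG TGA) becomes the 5' end of
# the antisense sequence, and the 5' end (ATG GGC ACT GTG) becomes its 3' end.
print(reverse_complement("GGGGTCCGCCAGTGA"))  # TCACTGGCGGACCCC
print(reverse_complement("ATGGGCACTGTG"))     # CACAGTGCCCAT
```

Applying the function to the full 792-nucleotide SEQ ID NO: 2 gives a reference against which the printed antisense listing can be checked.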
CLAIMS

1. A plant-transformation vector comprising a
DNA molecule that includes a gene derived, in part, from
a plant virus RNA molecule, wherein the gene is mutated
to encode an untranslatable plus sense RNA molecule.

2. The vector of claim 1 wherein the gene is
derived, in part, from potyvirus RNA.

3. The vector of claim 2 wherein the potyvirus
is Tobacco Etch Virus.

4. The vector of claim 2 wherein the gene is
derived, in part, from a coat protein gene of a
potyvirus.

5. The vector of claim 4 wherein the gene is
derived, in part, from the coat protein gene of Tobacco
Etch Virus.

6. A bacterial cell containing the vector of
claim 1.

7. The bacterial cell of claim 6 wherein the
bacterial cell is an Agrobacterium tumefaciens cell.

8. A transformed plant cell comprising a
heterologous DNA chromosomal insert that includes a gene
derived from a plant virus RNA molecule, wherein the
gene is mutated to encode an untranslatable plus sense
RNA molecule.

9. The plant cell of claim 8 wherein the gene
is derived from potyvirus RNA.

10. The plant cell of claim 9 wherein the
potyvirus is Tobacco Etch Virus.

11. The plant cell of claim 10 wherein the
gene is derived from a coat protein gene of a potyvirus.

12. The plant cell of claim 10 wherein the
gene is derived from the coat protein gene of Tobacco
Etch Virus and the plant cell is a tobacco plant cell.

13. A differentiated plant comprising
transformed plant cells of claim 8.

14. A differentiated plant comprising
transformed plant cells of claim 9.

15. A differentiated plant comprising
transformed plant cells of claim 10.

16. A differentiated plant comprising
transformed plant cells of claim 11.

17. A differentiated plant comprising
transformed plant cells of claim 12.

18. A recombinant gene comprising:
control regions which regulate transcription of the
gene; and
a region, derived from a plant virus,
mutated so as to render the RNA transcribed from the
gene untranslatable.

19. The recombinant gene of claim 18 wherein
the plant virus is a potyvirus.

20. The recombinant gene of claim 19 wherein
the virus-derived region is derived from the region of
the viral genome encoding a coat protein.

21. The recombinant gene of claim 20 wherein
the potyvirus is Tobacco Etch Virus.

22. A method of producing plants with a
reduced susceptibility to viral infection, comprising:
forming a recombinant gene derived, in
part, from viral RNA wherein the gene is mutated to
encode an untranslatable plus sense RNA molecule; and
transforming plants with the recombinant
gene.

23. The method of claim 22 wherein the method
of producing plants comprises:
constructing a recombinant gene comprising
a region of a viral genome capable of being transcribed
in a plant;
mutating the recombinant gene to encode an
untranslatable plus sense RNA molecule;
cloning the recombinant untranslatable
gene into a plant transformation vector;
transforming plant cells with the
transformation vector; and
culturing transformed cells under
conditions suitable for regeneration of transformed
plants.

24. The method of claim 23 wherein the viral
genome is a potyvirus genome.

25. The method of claim 24 wherein the region
of the viral genome encodes a coat protein.

26. The method of claim 25 wherein the viral
genome is the Tobacco Etch Virus genome.

27. The method of claim 26 wherein the plants
are tobacco plants.
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NAAATAACAA ATCTCAACAC AACATATACA AAACAAACGA ATCTCAAGCA ATCAAGCATT 60 

CTACTTCTAT TGCAGCAATT TAAATCATTT CTTTTAAAGC AAAAGCAATT TTCTGAAAAT 120 

TTTCACCATT TACGAACGAT AGCA ATG GCA CTG ATC TTT GGC AC A GTC AAC GCT 174 

Met Ala Leu He Phe Gly Thr Val Asn Ala 
15 10 

AAC ATC CTG AAG GAA GTG TTC GGT GGA GCT CGT ATG GCT TGC GTT ACC 222 

Asn He Leu Lys Glu Val Phe Gly Gly Ala Arg Met Ala Cys Val Thr 

15 20 25 

AGC GCA CAT ATG GCT GGA GCG AAT GGA AGC ATT TTG AAG AAG GCA GAA 270 
Ser Ala His Met Ala Gly Ala Asn Gly Ser He Leu Lys Lys Ala Glu 

30 35 40 

GAG ACC TCT CGT GCA ATC ATG CAC AAA CCA GTG ATC TTC GGA GAA GAC 318 

Glu Thr Ser Arg Ala He. Met His Lys Pro Val He Phe Gly Glu Asp 
45 50 55 

TAC ATT ACC GAG GCA GAC TTG CCT TAC ACA CCA CTC CAT TTA GAG GTC 366 
Tyr He Thr Glu Ala Asp Leu Pro Tyr Thr Pro Leu His Leu Glu Val 
60 65 70 

GAT GCT GAA ATG GAG CGG ATG TAT TAT CTT GGT CGT CGC GCG CTC ACC 414 
Asp Ala Glu Met Glu Arg Met Tyr Tyr Leu Gly Arg Arg Ala Leu Thr 
75 80 85 90 

CAT GGC AAG AGA CGC AAA GTT TCT GTG AAT AAC AAG AGG AAC AGG AGA 462 
His Gly Lys Arg Arg Lys Val Ser Val Asn Asn Lys Arg Asn Arg Arg 

95 100 105 

AGG AAA GTG GCC AAA ACG TAC GTG GGG CGT GAT TCC ATT GTT GAG AAG 510 

Arg Lys Val Ala Lys Thr Tyr Val Gly Arg Asp Ser Ile Val Glu Lys 

110 115 120 

ATT GTA GTG CCC CAC ACC GAG AGA AAG GTT GAT ACC ACA GCA GCA GTG 558 
He Val Val Pro His Thr Glu Arg Lys Val Asp Thr Thr Ala Ala Val 
125 130 135 

GAA GAC ATT TGC AAT GAA GCT ACC ACT CAA CTT GTG CAT AAT AGT ATG 606 
Glu Asp lie Cys Asn Glu Ala Thr Thr Gin Leu Val His Asn Ser Met 
140 145 150 

CCA AAG CGT AAG AAG CAG AAA AAC TTC TTG CCC GCC ACT TCA CTA AGT 654 
Pro Lys Arg Lys Lys Gin Lys Asn Phe Leu Pro Ala Thr Ser Leu Ser 
155 160 165 170 

AAC GTG TAT GCC CAA ACT TGG AGC ATA GTG CGC AAA CGC CAT ATG CAG 702 

Asn Val Tyr Ala Gin Thr Trp Ser He Val Arg Lys Arg His Met Gin 

175 180 185 

GTG GAG ATC ATT AGC AAG AAG AGC GTC CGA GCG AGG GTC AAG AGA TTT 750 
Val Glu Ile Ile Ser Lys Lys Ser Val Arg Ala Arg Val Lys Arg Phe 

190 195 200 

GAG GGC TCG GTG CAA TTG TTC GCA AGT GTG CGT CAC ATG TAT GGC GAG 798 

Glu Gly Ser Val Gin Leu Phe Ala Ser Val Arg His Met Tyr Gly Glu 
205 210 215 

AGG AAA AGG GTG GAC TTA CGT ATT GAC AAC TGG CAG CAA GAG ACA CTT 846 
Arg Lys Arg Val Asp Leu Arg He Asp Asn Trp Gin Gin Glu Thr Leu 
220 225 230 
CTA GAC CTT GCT AAA AGA TTT AAG AAT GAG AGA GTG GAT CAA TCG AAG 894 

Leu Asp Leu Ala Lys Arg Phe Lys Asn Glu Arg Val Asp Gln Ser Lys 
235 240 245 250 

CTC ACT TTT GGT TCA AGT GGC CTA GTT TTG AGG CAA GGC TCG TAC GGA 942 
Leu Thr Phe Gly Ser Ser Gly Leu Val Leu Arg Gin Gly Ser Tyr Gly 

255 260 265 

CCT GCG CAT TGG TAT CGA CAT GGT ATG TTC ATT GTA CGC GGT CGG TCG 990 
Pro Ala His Trp Tyr Arg His Gly Met Phe lie Val Arg Gly Arg Ser 

270 275 280 

GAT GGG ATG TTG GTG GAT GCT CGT GCG AAG GTA ACG TTC GCT GTT TGT 1038 
Asp Gly Met Leu Val Asp Ala Arg Ala Lys Val Thr Phe Ala Val Cys 
285 290 295 

CAC TCA ATG ACA CAT TAT AGC GAC AAA TCA ATC TCT GAG GCA TTC TTC 1086 
His Ser Met Thr His Tyr Ser Asp Lys Ser lie Ser Glu Ala Phe Phe 
300 305 310 

ATA CCA TAC TCT AAG AAA TTC TTG GAG TTG AGA CCA GAT GGA ATC TCC 1134 
lie Pro Tyr Ser Lys Lys Phe Leu Glu Leu Arg Pro Asp Gly He Ser 
315 320 325 330 

CAT GAG TGT ACA AGA GGA GTA TCA GTT GAG CGG TGC GGT GAG GTG GCT 1182 
His Glu Cys Thr Arg Gly Val Ser Val Glu Arg Cys Gly Glu Val Ala 

335 340 345 

GCA ATC CTG ACA CAA GCA CTT TCA CCG TGT GGT AAG ATC ACA TGC AAA 1230 
Ala He Leu Thr Gin Ala Leu Ser Pro Cys Gly Lys He Thr Cys Lys 

350 355 360 

CGT TGC ATG GTT GAA ACA CCT GAC ATT GTT GAG GGT GAG TCG GGA GAA 1278 
Arg Cys Met Val Glu Thr Pro Asp Ile Val Glu Gly Glu Ser Gly Glu 
365 370 375 

AGT GTC ACC AAC CAA GGT AAG CTC CTA GCA ATG CTG AAA GAA CAG TAT 1326 

Ser Val Thr Asn Gln Gly Lys Leu Leu Ala Met Leu Lys Glu Gln Tyr 
380 385 390 

CCA GAT TTC CCA ATG GCC GAG AAA CTA CTC ACA AGG TTT TTG CAA CAG 1374 
Pro Asp Phe Pro Met Ala Glu Lys Leu Leu Thr Arg Phe Leu Gin Gin 
395 400 405 410 

AAA TCA CTA GTA AAT ACA AAT TTG ACA GCC TGC GTG AGC GTC AAA CAA 1422 
Lys Ser Leu Val Asn Thr Asn Leu Thr Ala Cys Val Ser Val Lys Gin 

415 420 425 

CTC ATT GGT GAC CGC AAA CAA GCT CCA TTC ACA CAC GTA CTG GCT GTC 1470 

Leu He Gly Asp Arg Lys Gin Ala Pro Phe Thr His Val Leu Ala Val 

430 435 440 

AGC GAA ATT CTG TTT AAA GGC AAT AAA CTA ACA GGG GCT GAT CTC GAA 1518 
Ser Glu He Leu Phe Lys Gly Asn Lys Leu Thr Gly Ala Asp Leu Glu 
445 450 455 

GAG GCA AGC ACA CAT ATG CTT GAA ATA GCA AGG TTC TTG AAC AAT CGC 1566 
Glu Ala Ser Thr His Met Leu Glu He Ala Arg Phe Leu Asn Asn Arg 
460 465 470 

ACT GAA AAT ATG CGC ATT GGC CAC CTT GGT TCT TTC AGA AAT AAA ATC 1614 

Thr Glu Asn Met Arg He Gly His Leu Gly Ser Phe Arg Asn Lys He 
475 480 485 490 
TCA TCG AAG GCC CAT GTG AAT AAC GCA CTC ATG TGT GAT AAT CAA CTT 1662 

Ser Ser Lys Ala His Val Asn Asn Ala Leu Met Cys Asp Asn Gln Leu 

495 500 505 

GAT CAG AAT GGG AAT TTT ATT TGG GGA CTA AGG GGT GCA CAC GCA AAG 1710 
Asp Gin Asn Gly Asn Phe lie Trp Gly Leu Arg Gly Ala His Ala Lys 

510 515 520 

AGG TTT CTT AAA GGA TTT TTC ACT GAG ATT GAC CCA AAT GAA GGA TAC 1758 
Arg Phe Leu Lys Gly Phe Phe Thr Glu lie Asp Pro Asn Glu Gly Tyr 
525 530 535 

GAT AAG TAT GTT ATC AGG AAA CAT ATC AGG GGT AGC AGA AAG CTA GCA 1806 
Asp Lys Tyr Val lie Arg Lys His lie Arg Gly Ser Arg Lys Leu Ala 
540 545 550 

ATT GGC AAT TTG ATA ATG TCA ACT GAC TTC CAG ACG CTC AGG CAA CAA 1854 

lie Gly Asn Leu lie Met Ser Thr Asp Phe Gin Thr Leu Arg Gin Gin 
555 560 565 570 

ATT CAA GGC GAA ACT ATT GAG CGT AAA GAA ATT GGG AAT CAC TGC ATT 1902 
Ile Gln Gly Glu Thr Ile Glu Arg Lys Glu Ile Gly Asn His Cys Ile 

575 580 585 

TCA ATG CGG AAT GGT AAT TAC GTG TAC CCA TGT TGT TGT GTT ACT CTT 1950 
Ser Met Arg Asn Gly Asn Tyr Val Tyr Pro Cys Cys Cys Val Thr Leu 

590 595 600 

GAA GAT GGT AAG GCT CAA TAT TCG GAT CTA AAG CAC CCA ACG AAG AGA 1998 
Glu Asp Gly Lys Ala Gin Tyr Ser Asp Leu Lys His Pro Thr Lys Arg 
605 610 615 

CAT CTG GTC ATT GGC AAC TCT GGC GAT TCA AAG TAC CTA GAC CTT CCA 2046 

His Leu Val He Gly Asn Ser Gly Asp Ser Lys Tyr Leu Asp Leu Pro 
620 625 630 

GTT CTC AAT GAA GAG AAA ATG TAT ATA GCT AAT GAA GGT TAT TGC TAC 2094 
Val Leu Asn Glu Glu Lys Met Tyr He Ala Asn Glu Gly Tyr Cys Tyr 
635 640 645 650 

ATG AAC ATT TTC TTT GCT CTA CTA GTG AAT GTC AAG GAA GAG GAT GCA 2142 
Met Asn He Phe Phe Ala Leu Leu Val Asn Val Lys Glu Glu Asp Ala 

655 660 665 

AAG GAC TTC ACC AAG TTT ATA AGG GAC ACA ATT GTT CCA AAG CTT GGA 2190 

Lys Asp Phe Thr Lys Phe He Arg Asp Thr He Val Pro Lys Leu Gly 

670 675 680 

GCG TGG CCA ACA ATG CAA GAT GTT GCA ACT GCA TGC TAC TTA CTT TCC 2238 
Ala Trp Pro Thr Met Gln Asp Val Ala Thr Ala Cys Tyr Leu Leu Ser 
685 690 695 

ATT CTT TAC CCA GAT GTC CTG AGA GCT GAA CTA CCC AGA ATT TTG GTT 2286 
He Leu Tyr Pro Asp Val Leu Arg Ala Glu Leu Pro Arg He Leu Val 
700 705 710 

GAT CAT GAC AAC AAA ACA ATG CAT GTT TTG GAT TCG TAT GGG TCT AGA 2334 
Asp His Asp Asn Lys Thr Met His Val Leu Asp Ser Tyr Gly Ser Arg 
715 720 725 730 

ACG ACA GGA TAC CAC ATG TTG AAA ATG AAC ACA ACA TCC CAG CTA ATT 2382 

Thr Thr Gly Tyr His Met Leu Lys Met Asn Thr Thr Ser Gin Leu He 

735 740 745 
GAA TTC GTT CAT TCA GGT TTG GAA TCC GAA ATG AAA ACT TAC AAT GTT 2430 
Glu Phe Val His Ser Gly Leu Glu Ser Glu Met Lys Thr Tyr Asn Val 
750 755 760 

GGA GGG ATG AAC CGA GAT GTG GTC ACA CAA GGT GCA ATT GAG ATG TTG 2478 
Gly Gly Met Asn Arg Asp Val Val Thr Gln Gly Ala Ile Glu Met Leu 
765 770 775 

ATC AAG TCT ATA TAC AAA CCA CAT CTC ATG AAG CAG TTA CTT GAG GAA 2526 
Ile Lys Ser Ile Tyr Lys Pro His Leu Met Lys Gln Leu Leu Glu Glu 
780 785 790 

GAG CCA TAC ATA ATT GTC CTG GCA ATA GTC TCC CCT TCA ATT TTA ATT 2574 
Glu Pro Tyr Ile Ile Val Leu Ala Ile Val Ser Pro Ser Ile Leu Ile 
795 800 805 810 

GCC ATG TAC AAC TCT GGA ACT TTT GAG CAG GCG TTA CAA ATG TGG TTG 2622 
Ala Met Tyr Asn Ser Gly Thr Phe Glu Gln Ala Leu Gln Met Trp Leu 
815 820 825 

CCA AAT ACA ATG AGG TTA GCT AAC CTC GCT GCC ATC TTG TCA GCC TTA 2670 
Pro Asn Thr Met Arg Leu Ala Asn Leu Ala Ala Ile Leu Ser Ala Leu 
830 835 840 

GCG CAA AAG TTA ACT TTG GCA GAT TTG TTC GTC CAG CAG CGT AAT TTG 2718 
Ala Gln Lys Leu Thr Leu Ala Asp Leu Phe Val Gln Gln Arg Asn Leu 
845 850 855 

ATT AAT GAG TAT GCG CAG GTA ATT TTG GAC AAT CTG ATT GAC GGT GTC 2766 
Ile Asn Glu Tyr Ala Gln Val Ile Leu Asp Asn Leu Ile Asp Gly Val 
860 865 870 

AGG GTT AAT CAT TCG CTA TCC CTA GCA ATG GAA ATT GTT ACT ATT AAG 2814 
Arg Val Asn His Ser Leu Ser Leu Ala Met Glu Ile Val Thr Ile Lys 
875 880 885 890 

CTG GCC ACC CAA GAG ATG GAC ATG GCG TTG AGG GAA GGT GGC TAT GCT 2862 
Leu Ala Thr Gln Glu Met Asp Met Ala Leu Arg Glu Gly Gly Tyr Ala 
895 900 905 

GTG ACC TCT GAA AAG GTG CAT GAA ATG TTG GAA AAA AAC TAT GTA AAG 2910 
Val Thr Ser Glu Lys Val His Glu Met Leu Glu Lys Asn Tyr Val Lys 
910 915 920 

GCT TTG AAG GAT GCA TGG GAC GAA TTA ACT TGG TTG GAA AAA TTC TCC 2958 
Ala Leu Lys Asp Ala Trp Asp Glu Leu Thr Trp Leu Glu Lys Phe Ser 
925 930 935 

GCA ATC AGG CAT TCA AGA AAG CTC TTG AAA TTT GGG CGA AAG CCT TTA 3006 
Ala Ile Arg His Ser Arg Lys Leu Leu Lys Phe Gly Arg Lys Pro Leu 
940 945 950 

ATC ATG AAA AAC ACC GTA GAT TGC GGC GGA CAT ATA GAC TTG TCT GTG 3054 
Ile Met Lys Asn Thr Val Asp Cys Gly Gly His Ile Asp Leu Ser Val 
955 960 965 970 

AAA TCG CTT TTC AAG TTC CAC TTG GAA CTC CTG AAG GGA ACC ATC TCA 3102 
Lys Ser Leu Phe Lys Phe His Leu Glu Leu Leu Lys Gly Thr Ile Ser 
975 980 985 

AGA GCC GTA AAT GGT GGC GCA AGA AAG GTA AGA GTA GCG AAG AAT GCC 3150 
Arg Ala Val Asn Gly Gly Ala Arg Lys Val Arg Val Ala Lys Asn Ala 
990 995 1000 

ATG ACA AAA GGG GTT TTT CTC AAA ATC TAC AGC ATG CTT CCT GAC GTC 3198 
Met Thr Lys Gly Val Phe Leu Lys Ile Tyr Ser Met Leu Pro Asp Val 
1005 1010 1015 

TAC AAG TTT ATC ACA GTC TCG AGT GTC CTT TCC TTG TTG TTG ACA TTC 3246 
Tyr Lys Phe Ile Thr Val Ser Ser Val Leu Ser Leu Leu Leu Thr Phe 
1020 1025 1030 

TTA TTT CAA ATT GAC TGC ATG ATA AGG GCA CAC CGA GAG GCG AAG GTT 3294 
Leu Phe Gln Ile Asp Cys Met Ile Arg Ala His Arg Glu Ala Lys Val 
1035 1040 1045 1050 

GCT GCA CAG TTG CAG AAA GAG AGC GAG TGG GAC AAT ATC ATC AAT AGA 3342 
Ala Ala Gin Leu Gin Lys Glu Ser Glu Trp Asp Asn He He Asn Arg 

1055 1060 1065 

ACT TTC CAG TAT TCT AAG CTT GAA AAT CCT ATT GGC TAT CGC TCT ACA 3390 

Thr Phe Gln Tyr Ser Lys Leu Glu Asn Pro Ile Gly Tyr Arg Ser Thr 

1070 1075 1080 

GCG GAG GAA AGA CTC CAA TCA GAA CAC CCC GAG GCT TTC GAG TAC TAC 3438 

Ala Glu Glu Arg Leu Gln Ser Glu His Pro Glu Ala Phe Glu Tyr Tyr 
1085 1090 1095 

AAG TTT TGC ATT GGA AAG GAA GAC CTC GTT GAA CAG GCA AAA CAA CCG 3486 
Lys Phe Cys He Gly Lys Glu Asp Leu Val Glu Gin Ala Lys Gin Pro 
1100 1105 1110 

GAG ATA GCA TAC TTT GAA AAG ATT ATA GCT TTC ATC ACA CTT GTA TTA 3534 
Glu He Ala Tyr Phe Glu Lys He He Ala Phe He Thr Leu Val Leu 
1115 1120 1125 1130 

ATG GCT TTT GAC GCT GAG CGG AGT GAT GGA GTG TTC AAG ATA CTC AAT 3582 
Met Ala Phe Asp Ala Glu Arg Ser Asp Gly Val Phe Lys He Leu Asn 

1135 1140 1145 

AAG TTC AAA GGA ATA CTG AGC TCA ACG GAG AGG GAG ATC ATC TAC ACG 3630 
Lys Phe Lys Gly He Leu Ser Ser Thr Glu Arg Glu He He Tyr Thr 

1150 1155 1160 

CAG AGT TTG GAT GAT TAC GTT ACA ACC TTT GAT GAC AAT ATG ACA ATC 3678 
Gin Ser Leu Asp Asp Tyr Val Thr Thr Phe Asp Asp Asn Met Thr He 
1165 1170 1175 

AAC CTC GAG TTG AAT ATG GAT GAA CTC CAC AAG ACG AGC CTT CCT GGA 3726 
Asn Leu Glu Leu Asn Met Asp Glu Leu His Lys Thr Ser Leu Pro Gly 
1180 1185 1190 

GTC ACT TTT AAG CAA TGG TGG AAC AAC CAA ATC AGC CGA GGC AAC GTG 3774 
Val Thr Phe Lys Gin Trp Trp Asn Asn Gin He Ser Arg Gly Asn Val 
1195 1200 1205 1210 

AAG CCA CAT TAT AGA ACT GAG GGG CAC TTC ATG GAG TTT ACC AGA GAT 3822 

Lys Pro His Tyr Arg Thr Glu Gly His Phe Met Glu Phe Thr Arg Asp 

1215 1220 1225 

ACT GCG GCA TCG GTT GGC AGC GAG ATA TCA CAC TCA CCC GCA AGA GAT 3870 
Thr Ala Ala Ser Val Ala Ser Glu He Ser His Ser Pro Ala Arg Asp 

1230 1235 1240 

TTT CTT GTG AGA GGT GCT GTT GGA TCT GGA AAA TCC ACA GGA CTT CCA 3918 

Phe Leu Val Arg Gly Ala Val Gly Ser Gly Lys Ser Thr Gly Leu Pro 
1245 1250 1255 
TAC CAT TTA TCA AAG AGA GGG AGA GTG TTA ATG CTT GAG CCT ACC AGA 3966 
Tyr His Leu Ser Lys Arg Gly Arg Val Leu Met Leu Glu Pro Thr Arg 
1260 1265 1270 

CCA CTC ACA GAT AAC ATG CAC AAG CAA CTG AGA AGT GAA CCA TTT AAC 4014 
Pro Leu Thr Asp Asn Met His Lys Gin Leu Arg Ser Glu Pro Phe Asn 
1275 1280 1285 1290 

TGC TTC CCA ACT TTG AGG ATG AGA GGG AAG TCA ACT TTT GGG TCA TCA 4062 
Cys Phe Pro Thr Leu Arg Met Arg Gly Lys Ser Thr Phe Gly Ser Ser 

1295 1300 1305 

CCG ATC ACA GTC ATG ACT AGT GGA TTC GCT TTA CAC CAC TTT GCA CGA 4110 
Pro lie Thr Val Met Thr Ser Gly Phe Ala Leu His His Phe Ala Arg 

1310 1315 1320 

AAC ATA GCT GAG GTA AAA ACA TAC GAT TTT GTC ATA ATT GAT GAA TGT 4158 
Asn lie Ala Glu Val Lys Thr Tyr Asp Phe Val lie lie Asp Glu Cys 
1325 1330 1335 

CAT GTG AAT GAT GCT TCT GCT ATA GCG TTT AGG AAT CTA CTG TTT GAA 4206 
His Val Asn Asp Ala Ser Ala lie Ala Phe Arg Asn Leu Leu Phe Glu 
1340 1345 1350 

CAT GAA TTT GAA GGA AAA GTC CTC AAA GTG TCA GCC ACA CCA CCA GGT 4254 
His Glu Phe Glu Gly Lys Val Leu Lys Val Ser Ala Thr Pro Pro Gly 
1355 1360 1365 1370 

AGA GAA GTT GAA TTT ACA ACT CAG TTT CCC GTG AAA CTC AAG ATA GAA 4302 
Arg Glu Val Glu Phe Thr Thr Gin Phe Pro Val Lys Leu Lys lie Glu 

1375 1380 1385 

GAG GCT CTT AGC TTT CAG GAA TTT GTA AGT TTA CAA GGG ACA GGT GCC 4350 
Glu Ala Leu Ser Phe Gin Glu Phe Val Ser Leu Gin Gly Thr Gly Ala 

1390 1395 1400 

AAC GCC GAT GTG ATT AGT TGT GGC GAC AAC ATA CTA GTA TAT GTT GCT 4398 
Asn Ala Asp Val He Ser Cys Gly Asp Asn He Leu Val Tyr Val Ala 
1405 1410 1415 

AGC TAC AAT GAT GTT GAT AGT CTT GGC AAG CTC CTT GTG CAA AAG GGA 4446 
Ser Tyr Asn Asp Val Asp Ser Leu Gly Lys Leu Leu Val Gin Lys Gly 
1420 1425 1430 

TAC AAA GTG TCG AAG ATT GAT GGA AGA ACA ATG AAG AGT GGA GGA ACT 4494 
Tyr Lys Val Ser Lys He Asp Gly Arg Thr Met Lys Ser Gly Gly Thr 
1435 1440 1445 1450 

GAA ATA ATC ACT GAA GGT ACT TCA GTG AAA AAG CAT TTC ATA GTC GCA 4542 
Glu He He Thr Glu Gly Thr Ser Val Lys Lys His Phe He Val Ala 

1455 1460 1465 

ACT AAC ATT ATT GAG AAT GGT GTA ACC ATT GAC ATT GAT GTA GTT GTG 4590 

Thr Asn He He Glu Asn Gly Val Thr He Asp He Asp Val Val Val 

1470 1475 1480 

GAT TTT GGG ACT AAG GTT GTA CCA GTT TTG GAT GTG GAC AAT AGA GCG 4638 
Asp Phe Gly Thr Lys Val Val Pro Val Leu Asp Val Asp Asn Arg Ala 
1485 1490 1495 

GTG CAG TAC AAC AAA ACT GTG GTG AGT TAT GGG GAG CGC ATC CAA AAA 4686 

Val Gin Tyr Asn Lys Thr Val Val Ser Tyr Gly Glu Arg He Gin Lys 
1500 1505 1510 
CTC GGT AGA GTT GGG CGA CAC AAG GAA GGA GTA GCA CTT CGA ATT GGC 4734 
Leu Gly Arg Val Gly Arg His Lys Glu Gly Val Ala Leu Arg Ile Gly 
1515 1520 1525 1530 

CAA ACA AAT AAA ACA CTG GTT GAA ATT CCA GAA ATG GTT GCC ACT GAA 4782 
Gln Thr Asn Lys Thr Leu Val Glu Ile Pro Glu Met Val Ala Thr Glu 
1535 1540 1545 

GCT GCC TTT CTA TGC TTC ATG TAC AAT TTG CCA GTG ACA ACA CAG AGT 4830 
Ala Ala Phe Leu Cys Phe Met Tyr Asn Leu Pro Val Thr Thr Gln Ser 
1550 1555 1560 

GTT TCA ACC ACA CTG CTG GAA AAT GCC ACA TTA TTA CAA GCT AGA ACT 4878 
Val Ser Thr Thr Leu Leu Glu Asn Ala Thr Leu Leu Gln Ala Arg Thr 
1565 1570 1575 

ATG GCA CAG TTT GAG CTA TCA TAT TTT TAC ACA ATT AAT TTT GTG CGA 4926 
Met Ala Gln Phe Glu Leu Ser Tyr Phe Tyr Thr Ile Asn Phe Val Arg 
1580 1585 1590 

TTT GAT GGT AGT ATG CAT CCA GTC ATA CAT GAC AAG CTG AAG CGC TTT 4974 
Phe Asp Gly Ser Met His Pro Val Ile His Asp Lys Leu Lys Arg Phe 
1595 1600 1605 1610 

AAG CTA CAC ACT TGT GAG ACA TTC CTC AAT AAG TTG GCG ATC CCA AAT 5022 
Lys Leu His Thr Cys Glu Thr Phe Leu Asn Lys Leu Ala Ile Pro Asn 
1615 1620 1625 

AAA GGC TTA TCC TCT TGG CTT ACG AGT GGA GAG TAT AAG CGA CTT GGT 5070 
Lys Gly Leu Ser Ser Trp Leu Thr Ser Gly Glu Tyr Lys Arg Leu Gly 
1630 1635 1640 

TAC ATA GGA GAG GAT GCT GGC ATA AGA ATC CCA TTC GTG TGC AAA GAA 5118 
Tyr Ile Ala Glu Asp Ala Gly Ile Arg Ile Pro Phe Val Cys Lys Glu 
1645 1650 1655 

ATT CCA GAC TCC TTG CAT GAG GAA ATT TGG CAC ATT GTA GTC GCC CAT 5166 
Ile Pro Asp Ser Leu His Glu Glu Ile Trp His Ile Val Val Ala His 
1660 1665 1670 

AAA GGT GAC TCG GGT ATT GGG AGG CTC ACT AGC GTA CAG GCA GCA AAG 5214 
Lys Gly Asp Ser Gly Ile Gly Arg Leu Thr Ser Val Gln Ala Ala Lys 
1675 1680 1685 1690 

GTT GTT TAT ACT CTG CAA ACG GAT GTG CAC TCA ATT GCG AGG ACT CTA 5262 
Val Val Tyr Thr Leu Gln Thr Asp Val His Ser Ile Ala Arg Thr Leu 
1695 1700 1705 

GCA TGC ATC AAT AGA CGC ATA GCA GAT GAA CAA ATG AAG CAG AGT CAT 5310 
Ala Cys Ile Asn Arg Arg Ile Ala Asp Glu Gln Met Lys Gln Ser His 
1710 1715 1720 

TTT GAA GCC GCA ACT GGG AGA GCA TTT TCC TTC ACA AAT TAC TCA ATA 5358 
Phe Glu Ala Ala Thr Gly Arg Ala Phe Ser Phe Thr Asn Tyr Ser Ile 
1725 1730 1735 

CAA AGC ATA TTT GAC ACG CTG AAA GCA AAT TAT GCT ACA AAG CAT ACG 5406 
Gln Ser Ile Phe Asp Thr Leu Lys Ala Asn Tyr Ala Thr Lys His Thr 
1740 1745 1750 

AAA GAA AAT ATT GCA GTG CTT CAG CAG GCA AAA GAT CAA TTG CTA GAG 5454 
Lys Glu Asn Ile Ala Val Leu Gln Gln Ala Lys Asp Gln Leu Leu Glu 
1755 1760 1765 1770 

TTT TCG AAC CTA GCA AAG GAT CAA GAT GTC ACG GGT ATC ATC CAA GAC 5502 

Phe Ser Asn Leu Ala Lys Asp Gln Asp Val Thr Gly Ile Ile Gln Asp 

1775 1780 1785 

TTC AAT CAC CTG GAA ACT ATC TAT CTC CAA TCA GAT AGC GAA GTG GCT 5550 
Phe Asn His Leu Glu Thr He Tyr Leu Gin Ser Asp Ser Glu Val Ala 

1790 1795 1800 

AAG CAT CTG AAG CTT AAA AGT CAC TGG AAT AAA AGC CAA ATC ACT AGG 5598 
Lys His Leu Lys Leu Lys Ser His Trp Asn Lys Ser Gin He Thr Arg 
1805 1810 1815 

GAC ATC ATA ATA GCT TTG TCT GTG TTA ATT GGT GGT GGA TGG ATG CTT 5646 
Asp He He He Ala Leu Ser Val Leu He Gly Gly Gly Trp Met Leu 
1820 1825 1830 

GCA ACG TAC TTC AAG GAC AAG TTC AAT GAA CCA GTC TAT TTC CAA GGG 5694 

Ala Thr Tyr Phe Lys Asp Lys Phe Asn Glu Pro Val Tyr Phe Gin Gly 
1835 1840 1845 1850 

AAG AAG AAT CAG AAG CAC AAG CTT AAG ATG AGA GAG GCG CGT GGG GCT 5742 
Lys Lys Asn Gln Lys His Lys Leu Lys Met Arg Glu Ala Arg Gly Ala 

1855 1860 1865 



AGA GGG CAA TAT GAG GTT GCA GCG GAG CCA GAG GCG CTA GAA CAT TAC 5790 
Arg Gly Gln Tyr Glu Val Ala Ala Glu Pro Glu Ala Leu Glu His Tyr 
1870 1875 1880 

TTT GGA AGC GCA TAT AAT AAC AAA GGA AAG CGC AAG GGC ACC ACG AGA 5838 
Phe Gly Ser Ala Tyr Asn Asn Lys Gly Lys Arg Lys Gly Thr Thr Arg 
1885 1890 1895 

GGA ATG GGT GCA AAG TCT CGG AAA TTC ATA AAC ATG TAT GGG TTT GAT 5886 
Gly Met Gly Ala Lys Ser Arg Lys Phe Ile Asn Met Tyr Gly Phe Asp 
1900 1905 1910 

CCA ACT GAT TTT TCA TAC ATT AGG TTT GTG GAT CCA TTG ACA GGT CAC 5934 
Pro Thr Asp Phe Ser Tyr Ile Arg Phe Val Asp Pro Leu Thr Gly His 
1915 1920 1925 1930 

ACT ATT GAT GAG TCC ACA AAC GCA CCT ATT GAT TTA GTG CAG CAT GAG 5982 
Thr Ile Asp Glu Ser Thr Asn Ala Pro Ile Asp Leu Val Gln His Glu 
1935 1940 1945 

TTT GGA AAG GTT AGA ACA CGC ATG TTA ATT GAC GAT GAG ATA GAG CCT 6030 
Phe Gly Lys Val Arg Thr Arg Met Leu Ile Asp Asp Glu Ile Glu Pro 
1950 1955 1960 

CAA AGT CTT AGC ACC CAC ACC ACA ATC CAT GCT TAT TTG GTG AAT AGT 6078 
Gln Ser Leu Ser Thr His Thr Thr Ile His Ala Tyr Leu Val Asn Ser 
1965 1970 1975 

GGC ACG AAG AAA GTT CTT AAG GTT GAT TTA ACA CCA CAC TCG TCG CTA 6126 
Gly Thr Lys Lys Val Leu Lys Val Asp Leu Thr Pro His Ser Ser Leu 
1980 1985 1990 

CGT GCG AGT GAG AAA TCA ACA GCA ATA ATG GGA TTT CCT GAA AGG GAG 6174 
Arg Ala Ser Glu Lys Ser Thr Ala Ile Met Gly Phe Pro Glu Arg Glu 
1995 2000 2005 2010 

AAT GAA TTG CGT CAA ACC GGC ATG GCA GTG CCA GTG GCT TAT GAT CAA 6222 
Asn Glu Leu Arg Gln Thr Gly Met Ala Val Pro Val Ala Tyr Asp Gln 
2015 2020 2025 

TTG CCA CCA AAG AAT GAG GAC TTG ACG TTT GAA GGA GAA AGC TTG TTT 6270 
Leu Pro Pro Lys Asn Glu Asp Leu Thr Phe Glu Gly Glu Ser Leu Phe 

2030 2035 2040 

AAG GGA CCA CGT GAT TAC AAC CCG ATA TCG AGC AGO ATT TGT CAT TTG 6318 

Lys Gly Pro Arg Asp Tyr Asn Pro Ile Ser Ser Thr Ile Cys His Leu 
2045 2050 2055 

ACG AAT GAA TCT GAT GGG CAC ACA ACA TCG TTG TAT GGT ATT GGA TTT 6366 
Thr Asn Glu Ser Asp Gly His Thr Thr Ser Leu Tyr Gly Ile Gly Phe 
2060 2065 2070 

GGT CCC TTC ATC ATT ACA AAC AAG CAC TTG TTT AGA AGA AAT AAT GGA 6414 
Gly Pro Phe Ile Ile Thr Asn Lys His Leu Phe Arg Arg Asn Asn Gly 
2075 2080 2085 2090 

ACA CTG TTG GTC CAA TCA CTA CAT GGT GTA TTC AAG GTC AAG AAC ACC 6462 

Thr Leu Leu Val Gin Ser Leu His Gly Val Phe Lys Val Lys Asn Thr 

2095 2100 2105 

ACG ACT TTG CAA CAA CAC CTC ATT GAT GGG AGG GAC ATG ATA ATT ATT 6510 
Thr Thr Leu Gin Gin His Leu lie Asp Gly Arg Asp Met lie lie lie 

2110 2115 2120 

CGC ATG CCT AAG GAT TTC CCA CCA TTT CCT CAA AAG CTG AAA TTT AGA 6558 

Arg Met Pro Lys Asp Phe Pro Pro Phe Pro Gin Lys Leu Lys Phe Arg 
2125 2130 2135 

GAG CCA CAA AGG GAA GAG CGC ATA TGT CTT GTG ACA ACC AAC TTC CAA 6606 

Glu Pro Gin Arg Glu Glu Arg lie Cys Leu Val Thr Thr Asn Phe Gin 
2140 2145 2150 

ACT AAG AGC ATG TCT AGC ATG GTG TCA GAC ACT AGT TGC ACA TTC CCT 6654 

Thr Lys Ser Met Ser Ser Met Val Ser Asp Thr Ser Cys Thr Phe Pro 
2155 2160 2165 2170 

TCA TCT GAT GGC ATA TTC TGG AAG CAT TGG ATT CAA ACC AAG GAT GGG 6702 

Ser Ser Asp Gly Ile Phe Trp Lys His Trp Ile Gln Thr Lys Asp Gly 

2175 2180 2185 

CAG TGT GGC AGT CCA TTA GTA TCA ACT AGA GAT GGG TTC ATT GTT GGT 6750 

Gin Cys Gly Ser Pro Leu Val Ser Thr Arg Asp Gly Phe lie Val Gly 

2190 2195 2200 

ATA CAC TCA GCA TCG AAT TTC ACC AAC ACA AAC AAT TAT TTC ACA AGC 6798 
lie His Ser Ala Ser Asn Phe Thr Asn Thr Asn Asn Tyr Phe Thr Ser 
2205 2210 2215 

GTG CCG AAA AAC TTC ATG GAA TTG TTG ACA AAT CAG GAG GCG CAG CAG 6846 
Val Pro Lys Asn Phe Met Glu Leu Leu Thr Asn Gin Glu Ala Gin Gin 
2220 2225 2230 

TGG GTT AGT GGT TGG CGA TTA AAT GCT GAC TCA GTA TTG TGG GGG GGC 6894 
Trp Val Ser Gly Trp Arg Leu Asn Ala Asp Ser Val Leu Trp Gly Gly 
2235 2240 2245 2250 

CAT AAA GTT TTC ATG AGC AAA CCT GAA GAG CCT TTT CAG CCA GTT AAG 6942 
His Lys Val Phe Met Ser Lys Pro Glu Glu Pro Phe Gin Pro Val Lys 

2255 2260 2265 

GAA GCG ACT CAA CTC ATG AAT GAA TTG GTG TAC TCG CAA GGG GAG AAG 6990 
Glu Ala Thr Gin Leu Met Asn Glu Leu Val Tyr Ser Gin Gly Glu Lys 

2270 2275 2280 
AGG AAA TGG GTC GTG GAA GCA CTG TCA GGG AAC TTG AGG CCA GTG GCT 7038 
Arg Lys Trp Val Val Glu Ala Leu Ser Gly Asn Leu Arg Pro Val Ala 
2285 2290 2295 

GAG TGT CCC AGT CAG TTA GTC ACA AAG CAT GTG GTT AAA GGA AAG TGT 7086 
Glu Cys Pro Ser Gln Leu Val Thr Lys His Val Val Lys Gly Lys Cys 
2300 2305 2310 

CCC CTC TTT GAG CTC TAC TTG CAG TTG AAT CCA GAA AAG GAA GCA TAT 7134 
Pro Leu Phe Glu Leu Tyr Leu Gln Leu Asn Pro Glu Lys Glu Ala Tyr 
2315 2320 2325 2330 

TTT AAA CCG ATG ATG GGA GCA TAT AAG CCA AGT CGA CTT AAT AGA GAG 7182 
Phe Lys Pro Met Met Gly Ala Tyr Lys Pro Ser Arg Leu Asn Arg Glu 
2335 2340 2345 

GCG TTC CTC AAG GAC ATT CTA AAA TAT GCT AGT GAA ATT GAG ATT GGG 7230 
Ala Phe Leu Lys Asp Ile Leu Lys Tyr Ala Ser Glu Ile Glu Ile Gly 
2350 2355 2360 

AAT GTG GAT TGT GAC TTG CTG GAG CTT GCA ATA AGC ATG CTC GTC ACA 7278 
Asn Val Asp Cys Asp Leu Leu Glu Leu Ala Ile Ser Met Leu Val Thr 
2365 2370 2375 

AAG CTC AAG GCG TTA GGA TTC CCA ACT GTG AAC TAC ATC ACT GAC CCA 7326 
Lys Leu Lys Ala Leu Gly Phe Pro Thr Val Asn Tyr Ile Thr Asp Pro 
2380 2385 2390 

GAG GAA ATT TTT AGT GCA TTG AAT ATG AAA GCA GCT ATG GGA GCA CTA 7374 
Glu Glu Ile Phe Ser Ala Leu Asn Met Lys Ala Ala Met Gly Ala Leu 
2395 2400 2405 2410 

TAC AAA GGC AAG AAG AAA GAA GCT CTC AGC GAG CTC ACA CTA GAT GAG 7422 
Tyr Lys Gly Lys Lys Lys Glu Ala Leu Ser Glu Leu Thr Leu Asp Glu 
2415 2420 2425 

CAG GAG GCA ATG CTC AAA GCA AGT TGC CTG CGA CTG TAT ACG GGA AAG 7470 
Gln Glu Ala Met Leu Lys Ala Ser Cys Leu Arg Leu Tyr Thr Gly Lys 
2430 2435 2440 

TTG GGA ATT TGG AAT GGC TCA TTG AAA GCA GAG TTG CGT CCA ATT GAG 7518 
Leu Gly Ile Trp Asn Gly Ser Leu Lys Ala Glu Leu Arg Pro Ile Glu 
2445 2450 2455 

AAG GTT GAA AAC AAC AAA ACG CGA ACT TTC ACA GCA GCA CCA ATA GAC 7566 
Lys Val Glu Asn Asn Lys Thr Arg Thr Phe Thr Ala Ala Pro Ile Asp 
2460 2465 2470 

ACT CTT CTT GCT GGT AAA GTT TGC GTG GAT GAT TTC AAC AAT CAA TTT 7614 
Thr Leu Leu Ala Gly Lys Val Cys Val Asp Asp Phe Asn Asn Gln Phe 
2475 2480 2485 2490 

TAT GAT CTC AAC ATA AAG GCA CCA TGG ACA GTT GGT ATG ACT AAG TTT 7662 
Tyr Asp Leu Asn Ile Lys Ala Pro Trp Thr Val Gly Met Thr Lys Phe 
2495 2500 2505 

TAT CAG GGG TGG AAT GAA TTG ATG GAG GCT TTA CCA AGT GGG TGG GTG 7710 
Tyr Gln Gly Trp Asn Glu Leu Met Glu Ala Leu Pro Ser Gly Trp Val 
2510 2515 2520 

TAT TGT GAC GCT GAT GGT TCG CAA TTC GAC AGT TCC TTG ACT CCA TTC 7758 
Tyr Cys Asp Ala Asp Gly Ser Gln Phe Asp Ser Ser Leu Thr Pro Phe 
2525 2530 2535 

CTC ATT AAT GCT GTA TTG AAA GTG CGA CTT GCC TTC ATG GAG GAA TGG 7806 
Leu Ile Asn Ala Val Leu Lys Val Arg Leu Ala Phe Met Glu Glu Trp 
2540 2545 2550 

GAT ATT GGT GAG CAA ATG CTG CGA AAT TTG TAC ACT GAG ATA GTG TAT 7854 
Asp Ile Gly Glu Gln Met Leu Arg Asn Leu Tyr Thr Glu Ile Val Tyr 
2555 2560 2565 2570 

ACA CCA ATC CTC ACA CCG GAT GGT ACT ATC ATT AAG AAG CAT AAA GGC 7902 
Thr Pro Ile Leu Thr Pro Asp Gly Thr Ile Ile Lys Lys His Lys Gly 

2575 2580 2585 

AAC AAT AGC GGG CAA CCT TCA ACA GTG GTG GAC AAC ACA CTC ATG GTC 7950 
Asn Asn Ser Gly Gin Pro Ser Thr Val Val Asp Asn Thr Leu Met Val 

2590 2595 2600 

ATT ATT GCA ATG TTA TAC ACA TGT GAG AAG TGT GGA ATC AAC AAG GAA 7998 

Ile Ile Ala Met Leu Tyr Thr Cys Glu Lys Cys Gly Ile Asn Lys Glu 

2605 2610 2615 

GAG ATT GTG TAT TAC GTC AAT GGC GAT GAC CTA TTG ATT GCC ATT CAC 8046 
Glu Ile Val Tyr Tyr Val Asn Gly Asp Asp Leu Leu Ile Ala Ile His 
2620 2625 2630 

CCA GAT AAA GCT GAG AGG TTG AGT AGA TTC AAA GAA TCT TTC GGA GAG 8094 
Pro Asp Lys Ala Glu Arg Leu Ser Arg Phe Lys Glu Ser Phe Gly Glu 
2635 2640 2645 2650 

TTG GGC CTG AAA TAT GAA TTT GAC TGT ACC ACC AGG GAC AAG ACA CAG 8142 

Leu Gly Leu Lys Tyr Glu Phe Asp Cys Thr Thr Arg Asp Lys Thr Gin 

2655 2660 2665 

TTG TGG TTC ATG TCA CAC AGG GCT TTG GAG AGG GAT GGC ATG TAT ATA 8190 
Leu Trp Phe Met Ser His Arg Ala Leu Glu Arg Asp Gly Met Tyr lie 

2670 2675 2680 

CCA AAG CTA GAA GAA GAA AGG ATT GTT TCT ATT TTG GAA TGG GAC AGA 8238 

Pro Lys Leu Glu Glu Glu Arg Ile Val Ser lie Leu Glu Trp Asp Arg 
2685 2690 2695 

TCC AAA GAG CCG TCA CAT AGG CTT GAA GCC ATC TGT GCA TCA ATG ATT 8286 
Ser Lys Glu Pro Ser His Arg Leu Glu Ala lie Cys Ala Ser Met He 
2700 2705 2710 

GAA GCA TGG GGT TAT GAC AAG CTG GTT GAA GAA ATC CGC AAT TTC TAT 8334 
Glu Ala Trp Gly Tyr Asp Lys Leu Val Glu Glu He Arg Asn Phe Tyr 
2715 2720 2725 2730 

GCA TGG GTT TTG GAA CAA GCG CCG TAT TCA CAG CTT GCA GAA GAA GGA 8382 
Ala Trp Val Leu Glu Gin Ala Pro Tyr Ser Gin Leu Ala Glu Glu Gly 

2735 2740 2745 

AAG GCG CCA TAT CTG GCT GAG ACT GCG CTT AAG TTT TTG TAC ACA TCT 8430 
Lys Ala Pro Tyr Leu Ala Glu Thr Ala Leu Lys Phe Leu Tyr Thr Ser 

2750 2755 2760 

CAG CAC GGA ACA AAC TCT GAG ATA GAA GAG TAT TTA AAA GTG TTG TAT 8478 
Gin His Gly Thr Asn Ser Glu He Glu Glu Tyr Leu Lys Val Leu Tyr 
2765 2770 2775 

GAT TAC GAT ATT CCA ACG ACT GAG AAT CTT TAT TTT CAG AGT GGC ACT 8526 
Asp Tyr Asp lie Pro Thr Thr Glu Asn Leu Tyr Phe Gin Ser Gly Thr 
2780 2785 2790 
GTG GAT GCT GGT GCT GAC GCT GGT AAG AAG AAA GAT CAA AAG GAT GAT 8574 
Val Asp Ala Gly Ala Asp Ala Gly Lys Lys Lys Asp Gln Lys Asp Asp 
2795 2800 2805 2810 

AAA GTC GCT GAG CAG GCT TCA AAG GAT AGG GAT GTT AAT GCT GGA ACT 8622 
Lys Val Ala Glu Gin Ala Ser Lys Asp Arg Asp Val Asn Ala Gly Thr 

2815 2820 2825 

TCA GGA ACA TTC TCA GTT CCA CGA ATA AAT GCT ATG GCC ACA AAA CTT 8670 
Ser Gly Thr Phe Ser Val Pro Arg lie Asn Ala Met Ala Thr Lys Leu 

2830 2835 2840 

CAA TAT CCA AGG ATG AGG GGA GAG GTG GTT GTA AAC TTG AAT CAC CTT 8718 
Gin Tyr Pro Arg Met Arg Gly Glu Val Val Val Asn Leu Asn His Leu 
2845 2850 2855 

TTA GGA TAC AAG CCA CAG CAA ATT GAT TTG TCA AAT GCT CGA GCC ACA 8766 
Leu Gly Tyr Lys Pro Gln Gln Ile Asp Leu Ser Asn Ala Arg Ala Thr 
2860 2865 2870 

CAT GAG CAG TTT GCC GCG TGG CAT CAG GCA GTG ATG ACA GCC TAT GGA 8814 
His Glu Gin Phe Ala Ala Trp His Gin Ala Val Met Thr Ala Tyr Gly 
2875 2880 2885 2890 

GTG AAT GAA GAG CAA ATG AAA ATA TTG CTA AAT GGA TTT ATG GTG TGG 8862 
Val Asn Glu Glu Gln Met Lys Ile Leu Leu Asn Gly Phe Met Val Trp 

2895 2900 2905 

TGC ATA GAA AAT GGG ACT TCC CCA AAT TTG AAC GGA ACT TGG GTT ATG 8910 
Cys Ile Glu Asn Gly Thr Ser Pro Asn Leu Asn Gly Thr Trp Val Met 

2910 2915 2920 

ATG GAT GGT GAG GAT CAA GTT TCA TAC CCG CTG AAA CCA ATG GTT GAA 8958 
Met Asp Gly Glu Asp Gin Val Ser Tyr Pro Leu Lys Pro Met Val Glu 
2925 2930 2935 

AAC GCG CAG CCA ACA CTG AGG CAA ATT ATG ACA CAC TTC AGT GAC CTG 9006 

Asn Ala Gin Pro Thr Leu Arg Gin lie Met Thr His Phe Ser Asp Leu 
2940 2945 2950 

GCT GAA GCG TAT ATT GAG ATG AGG AAT AGG GAG CGA CCA TAC ATG CCT 9054 
Ala Glu Ala Tyr lie Glu Met Arg Asn Arg Glu Arg Pro Tyr Met Pro 
2955 2960 2965 2970 

AGG TAT GGT CTA CAG AGA AAC ATT ACA GAC ATG AGT TTG TCA CGC TAT 9102 
Arg Tyr Gly Leu Gin Arg Asn lie Thr Asp Met Ser Leu Ser Arg Tyr 

2975 2980 2985 

GCG TTC GAC TTC TAT GAG CTA ACT TCA AAA ACA CCT GTT AGA GCG AGG 9150 

Ala Phe Asp Phe Tyr Glu Leu Thr Ser Lys Thr Pro Val Arg Ala Arg 

2990 2995 3000 

GAG GCG CAT ATG CAA ATG AAA GCT GCT GCA GTA CGA AAC AGT GGA ACT 9198 

Glu Ala His Met Gin Met Lys Ala Ala Ala Val Arg Asn Ser Gly Thr 
3005 3010 3015 

AGG TTA TTT GGT CTT GAT GGC AAC GTG GGT ACT GCA GAG GAA GAC ACT 9246 
Arg Leu Phe Gly Leu Asp Gly Asn Val Gly Thr Ala Glu Glu Asp Thr 
3020 3025 3030 

GAA CGG CAC ACA GCG CAC GAT GTG AAC CGT AAC ATG CAC ACA CTA TTA 9294 
Glu Arg His Thr Ala His Asp Val Asn Arg Asn Met His Thr Leu Leu 
3035 3040 3045 3050 



FIG. 1 






WO 93/17098 






PCT/US93/01544 



GGG GTC CGC CAG TGA TAGTTTCTGC GTGTCTTTGC TTTCCGCTTT TAAGCTTATT 9349 

Gly Val Arg Gln 

GTAATATATA TGAATAGCTA TTCACAGTGG GACTTGGTCT TGTGTTGAAT AGTATCTTAT 9409 

ATATTTTAAT ATGTCTTATT AGTCTCATTA CTTAGGCGAA CGACAAAGTG AGGTCACCTC 9469 

GGTCTAATTC TCCTATGTAG TGCGAG 9495 
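
The two numbering schemes in FIG. 1 (nucleotide positions at the right of each row, residue positions beneath the amino acids) can be cross-checked with a little arithmetic. The Python sketch below assumes, as read from the first rows of the figure, that the open reading frame begins with the A of the ATG at nucleotide 145; the helper names `codon_span` and `residue_at` are illustrative and are not part of the original disclosure.

```python
ORF_START = 145  # nucleotide position of the A in the first ATG of FIG. 1


def codon_span(residue):
    """Return the (first, last) nucleotide positions of a 1-based residue."""
    if residue < 1:
        raise ValueError("residue numbers start at 1")
    first = ORF_START + 3 * (residue - 1)
    return first, first + 2


def residue_at(nucleotide):
    """Return the 1-based residue whose codon contains the given nucleotide."""
    if nucleotide < ORF_START:
        raise ValueError("position lies in the 5' untranslated region")
    return (nucleotide - ORF_START) // 3 + 1


# Each full row of FIG. 1 holds 16 codons (48 nucleotides), so the residue
# reached at a row end follows directly from the row's nucleotide number.
for row_end in (174, 222, 2430, 9294):
    print(row_end, residue_at(row_end))
```

A check of this kind is useful when repairing a scanned copy of the figure, because a misread row number or residue marker shows up as a disagreement between the two numbering schemes.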